(57) Abstract: The invention concerns a method of determining the presence or absence of a biological condition in humans, in 
particular of colon cancer, and of determining the stage of a condition in human tissue by determining an expression pattern of a 
cell sample. Further, the invention relates to a method of determining the presence or absence of a biological condition in human 
tissue, and of determining the stage of a biological condition in human tissue, and also for reducing biological abnormalities of a cell 
suffering from the biological condition. A method for producing antibodies against an expression product of a cell from the tissue 
is also described. The invention also discloses a pharmaceutical composition for the treatment of a biological condition comprising 
at least one antibody, and a vaccine for the prophylaxis or treatment of a biological condition. Further, the invention describes the 
use of a method for producing an assay for diagnosing a biological condition in human tissue, the use of a peptide or a gene or a 
probe for the preparation of a pharmaceutical composition for the treatment of a biological condition in human tissue, and an assay 
for determining the presence or absence of a biological condition in human tissue and for determining an expression pattern of a cell. 
Gene expression in biological conditions 

Technical field of the invention 

The present invention relates to a method of determining the presence or absence of 
a biological condition in animal tissue, wherein the expression of genes in normal 
tissue and tissue from the biological condition is examined and correlated to 
standards. The invention further relates to treatment of the biological condition and 
an assay for determining the condition. 

Background 

The building of large databases containing human genome sequences is the basis 
for studies of gene expression in various tissues during normal physiological and 
pathologic conditions. Constantly (constitutively) expressed sequences, as well as 
sequences whose expression is altered during disease processes, are important for 
our understanding of cellular properties and for the identification of candidate genes 
for future therapeutic intervention. As the number of known genes and ESTs builds 
up in the databases, array-based simultaneous screening of thousands of genes is 
necessary to obtain a profile of transcriptional behaviour and to identify key genes 
that, either alone or in combination with other genes, control various aspects of 
cellular life. One cellular behaviour that has been a mystery for many years is the 
malignant behaviour of cancer cells. We now know that, for example, defects in DNA 
repair can lead to cancer, but the cancer-creating mechanism in heterozygous 
individuals is still largely unknown, as is the malignant cell's ability to repeat cell 
cycles, to avoid apoptosis, to escape the immune system, to invade and metastasize, 
and to escape therapy. There are hints and indications in these areas and excellent 
progress has been made, but the myriad of genes interacting with each other in a 
highly complex multidimensional network is making the road to insight long and 
contorted. 

Tumors that appear similar morphologically, histochemically, and microscopically can 
be profoundly different. They can have different invasive and metastasizing 
properties, and can respond differently to therapy. There is thus a need in the art 
for methods which distinguish tumors and tissues on different bases than are 
currently in use in the clinic. 

The malignant transformation from normal tissue to cancer is believed to be a 
multistep process, in which tumor suppressor genes, which normally repress cancer 
growth, show reduced gene expression and in which other genes that encode tumor 
promoting proteins (oncogenes) show an increased expression level. Several tumor 
suppressor genes have been identified so far, e.g. p16, Rb, and p53 (Nesrin 
Ozoren and Wafik S. El-Deiry, Introduction to cancer genes and growth control, In: 
DNA alterations in cancer, genetic and epigenetic changes, Eaton publishing, 
Melanie Ehrlich (ed), p. 1-43, 2000; and references therein). 
They are usually identified by their lack of expression or their mutation in cancer 
tissue. 

Other examinations have shown this downregulation of transcripts to be partly due 
to loss of genomic material (loss of heterozygosity), partly due to methylation of 
promoter regions, and partly due to unknown factors (Nesrin Ozoren and Wafik S. 
El-Deiry, Introduction to cancer genes and growth control, In: DNA alterations in 
cancer, genetic and epigenetic changes, Eaton publishing, Melanie Ehrlich (ed), 
p. 1-43, 2000; and references therein). 

Several oncogenes are known, e.g. cyclinD1/PRAD1/BCL1, FGFs, c-MYC, and BCL-2, 
all of which are genes that are amplified in cancer, showing an increased level of 
transcript (Nesrin Ozoren and Wafik S. El-Deiry, Introduction to cancer genes 
and growth control, In: DNA alterations in cancer, genetic and epigenetic changes, 
Eaton publishing, Melanie Ehrlich (ed), p. 1-43, 2000; and references therein). Many 
of these genes are related to cell growth and direct the tumor cells to uninhibited 
growth. Others may be related to tissue degradation, as they e.g. encode enzymes 
that break down the surrounding connective tissue. 

Summary of the invention 

In one aspect the present invention relates to a method of determining the presence 
or absence of a biological condition in animal tissue comprising 

collecting a sample comprising cells from the tissue and/or expression products 
from the cells, 

assaying a first expression level of at least one gene from a first gene group, 
wherein the gene from the first gene group is selected from genes expressed in 
normal tissue cells in an amount higher than expression in biological condition 
cells, and/or 

assaying a second expression level of at least one gene from a second gene 
group, wherein the second gene group is selected from genes expressed in 
normal tissue cells in an amount lower than expression in biological condition 
cells, 

correlating the first expression level to a standard expression level for normal 
tissue, and/or the second expression level to a standard expression level for 
biological condition cells to determine the presence or absence of a biological 
condition in the animal tissue. 
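
The correlation step above can be illustrated with a minimal sketch. It assumes that each measured level is compared with its standard as a simple ratio and that a majority of genes decides the outcome. The function names, the `cutoff` of 0.5, the majority rule, and the gene labels are hypothetical and are not specified in this description.

```python
def condition_votes(first_levels, second_levels,
                    normal_standard, condition_standard, cutoff=0.5):
    """Count assayed genes whose level points towards the condition.

    first_levels / second_levels map gene -> measured level for genes of
    the first and second gene group. Standards must be positive numbers.
    cutoff is a hypothetical ratio threshold chosen for illustration.
    """
    votes = 0
    for gene, level in first_levels.items():
        # First-group genes are high in normal tissue, so a level far
        # below the normal-tissue standard suggests the condition.
        if level / normal_standard[gene] < cutoff:
            votes += 1
    for gene, level in second_levels.items():
        # Second-group genes are high in condition cells, so a level close
        # to the condition standard suggests the condition.
        if level / condition_standard[gene] >= cutoff:
            votes += 1
    return votes, len(first_levels) + len(second_levels)


def condition_present(first_levels, second_levels,
                      normal_standard, condition_standard, cutoff=0.5):
    """Majority rule over all assayed genes (an assumption); ties are False."""
    votes, total = condition_votes(first_levels, second_levels,
                                   normal_standard, condition_standard, cutoff)
    return total > 0 and 2 * votes > total
```

With invented numbers, a sample in which a first-group gene reads 40 against a normal standard of 200 and a second-group gene reads 900 against a condition standard of 1000 would be called positive under these assumptions.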

Animal tissue may be tissue from any animal, preferably from a mammal, such as a 
horse, a cow, a dog, or a cat, and more preferably the tissue is human tissue. The 
biological condition may be any condition exhibiting gene expression different from 
normal tissue. In particular the biological condition relates to a malignant or 
premalignant condition, such as a tumor or cancer. 

Furthermore, the invention relates to a method of determining the stage of a 
biological condition in animal tissue, comprising 

collecting a sample comprising cells from the tissue, 

assaying the expression of at least a first stage gene from a first stage gene 
group and at least a second stage gene from a second stage gene group, 
wherein at least one of said genes is expressed in said first stage of the 
condition in a higher amount than in said second stage, and the other gene is 
expressed in said first stage of the condition in a lower amount than in said 
second stage of the condition, 

correlating the expression level of the at least two genes to a standard level of 
expression, thereby determining the stage of the condition. 

Thereby, it is possible to determine the biological condition in more detail, such as 
by determining a stage and/or a grade of a tumor. 
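
As an illustration of the staging step, the sketch below assigns a sample to the stage whose standard levels lie closest on a log2 scale. The distance measure, the function name `nearest_stage`, and the gene labels are assumptions made for illustration; the description itself only requires that the levels be correlated to a standard.

```python
import math


def nearest_stage(sample, stage_standards):
    """Return the stage whose standard expression levels are closest.

    sample: gene -> measured level (positive numbers).
    stage_standards: stage name -> {gene -> standard level}.
    Closeness is the sum of squared log2 differences, a hypothetical
    choice that treats a two-fold increase and a two-fold decrease alike.
    """
    def distance(standard):
        return sum(
            (math.log2(sample[gene]) - math.log2(level)) ** 2
            for gene, level in standard.items()
        )

    return min(stage_standards, key=lambda stage: distance(stage_standards[stage]))
```

Because one gene falls and the other rises between the two stages, even two genes separate the standards clearly in this representation.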

The methods above may be used for determining single gene expressions; however, 
the invention also relates to a method of determining an expression pattern of a 
colon cell sample, comprising: 

collecting a sample comprising colon and/or rectum cells and/or expression 
products from colon and/or rectum cells, 

determining the expression level of two or more genes in the sample, wherein at 
least one gene belongs to a first group of genes, said gene from the first gene 
group being expressed in a higher amount in normal tissue than in biological 
condition cells, and wherein at least one other gene belongs to a second group 
of genes, said gene from the second gene group being expressed in a lower 
amount in normal tissue than in biological condition cells, and the difference 
between the expression level of the first gene group in normal cells and 
biological condition cells being at least two-fold, 

obtaining an expression pattern of the colon and/or rectum cell sample. 
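
The two-fold criterion can be sketched as follows. The sketch assumes reference levels for normal and biological condition cells are available as dictionaries, and, as a simplification not stated above, applies the same two-fold threshold to both gene groups. All names are hypothetical.

```python
def split_gene_groups(normal, condition, min_fold=2.0):
    """Sort genes into a first group (higher in normal cells) and a
    second group (higher in condition cells) using a fold threshold.

    normal / condition: gene -> reference expression level.
    """
    first_group, second_group = [], []
    for gene in sorted(normal.keys() & condition.keys()):
        n, c = normal[gene], condition[gene]
        if n == 0 and c == 0:
            continue  # not expressed in either reference, uninformative
        if n >= min_fold * c:
            first_group.append(gene)
        elif c >= min_fold * n:
            second_group.append(gene)
    return first_group, second_group


def expression_pattern(sample, first_group, second_group):
    """Keep sample levels for the selected genes; the pattern needs at
    least one gene from each group."""
    if not first_group or not second_group:
        raise ValueError("need at least one gene from each group")
    return {gene: sample[gene] for gene in first_group + second_group}
```

Genes that change less than two-fold in either direction are left out of the pattern in this sketch.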

Gene expression patterns may rely on one or a few genes, but more preferred gene 
expression patterns rely on expression from multiple genes, whereby combined 
information from several genes is obtained. 

Further, the invention relates to a method of determining an expression pattern of a 
colon cell sample independent of the proportion of submucosal, muscle, or 
connective tissue cells present, comprising: 

determining the expression of one or more genes in a sample comprising cells, 
wherein the one or more genes exclude genes which are expressed in the 
submucosal, muscle, or connective tissue, whereby a pattern of expression is 
formed for the sample which is independent of the proportion of submucosal, 
muscle, or connective tissue cells in the sample. 
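
The exclusion step can be illustrated with a short sketch. It assumes reference expression profiles for submucosal, muscle, and connective tissue are available, and that a gene counts as expressed when its level reaches a detection cutoff. The cutoff value and all names are hypothetical.

```python
def genes_expressed_in(reference_profiles, detection_cutoff):
    """Collect genes detected in any of the reference tissues.

    reference_profiles: tissue name -> {gene -> expression level}.
    detection_cutoff: hypothetical level at or above which a gene is
    regarded as expressed in that tissue.
    """
    expressed = set()
    for profile in reference_profiles.values():
        expressed.update(
            gene for gene, level in profile.items() if level >= detection_cutoff
        )
    return expressed


def tissue_independent_pattern(sample, excluded_genes):
    """Drop excluded genes so the remaining pattern does not depend on how
    much submucosal, muscle, or connective tissue the sample contains."""
    return {gene: level for gene, level in sample.items()
            if gene not in excluded_genes}
```

The excluded set only needs to be computed once per set of reference tissues and can then be reused for every sample.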

The expression pattern may be used in a method according to this information, and 
accordingly, the invention also relates to a method of determining the presence or 
absence of a biological condition in human colon and/or rectum tissue comprising, 

collecting a sample comprising cells from the tissue, 

determining an expression pattern of the cells as defined above, 

correlating the determined expression pattern to a standard pattern, 

determining the presence or absence of the biological condition in said tissue, 

as well as a method for determining the stage of a biological condition in animal 
tissue, comprising 

collecting a sample comprising cells from the tissue, 

determining an expression pattern of the cells as defined above, 

correlating the determined expression pattern to a standard pattern, 

determining the stage of the biological condition in said tissue. 
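
One way to correlate a determined pattern with standard patterns is a Pearson correlation on log2-transformed levels, sketched below. The choice of Pearson correlation, the log transform, and the function names are assumptions for illustration; the description does not prescribe a particular correlation measure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences.
    Assumes neither sequence is constant (otherwise division by zero)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def best_matching_standard(pattern, standards):
    """Return (name, correlation) of the standard pattern that correlates
    best with the sample pattern. All levels must be positive."""
    genes = sorted(pattern)
    sample = [math.log2(pattern[gene]) for gene in genes]
    scores = {
        name: pearson(sample, [math.log2(standard[gene]) for gene in genes])
        for name, standard in standards.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]
```

The standards may represent normal tissue and the biological condition, or individual stages, so the same comparison serves both methods listed above.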

The invention further relates to a method for reducing cell tumorigenicity of a cell, 
said method comprising 

contacting a tumor cell with at least one peptide expressed by at least one gene 
selected from genes being expressed in an amount two-fold higher in normal cells 
than the amount expressed in said tumor cell, or 

comprising 

obtaining at least one gene selected from genes being expressed in an amount 
two-fold higher in normal cells than the amount expressed in said tumor cell, 

introducing said at least one gene into the tumor cell in a manner allowing 
expression of said gene(s), or 

obtaining at least one nucleotide probe capable of hybridising with at least one gene 
of a tumor cell, said at least one gene being selected from genes being expressed in 
an amount one-fold lower in normal cells than the amount expressed in said tumor 
cell, and 

introducing said at least one nucleotide probe into the tumor cell in a manner 
allowing the probe to hybridise to the at least one gene, thereby inhibiting 
expression of said at least one gene. 

In a further aspect the invention relates to a method for producing antibodies against 
an expression product of a cell from a biological tissue, said method comprising the 
steps of 

obtaining expression product(s) from at least one gene, said gene being expressed 
as defined above, 

immunising a mammal with said expression product(s), obtaining antibodies against 
the expression product. 

The antibodies produced may be used for producing a pharmaceutical composition. 
Further, the invention relates to a vaccine capable of eliciting an immune response 
against at least one expression product from at least one gene, said gene being 
expressed as defined above. 

The invention furthermore relates to the use of any of the methods discussed above 
for producing an assay for diagnosing a biological condition in animal tissue. 

Also, the invention relates to the use of a peptide as defined above as an expression 
product and/or the use of a gene as defined above and/or the use of a probe as 
defined above for preparation of a pharmaceutical composition for the treatment of a 
biological condition in animal tissue. 

In a yet further aspect the invention relates to an assay for determining the presence 
or absence of a biological condition in animal tissue, comprising 

at least one first marker capable of detecting a first expression level of at least 
one gene from a first gene group, wherein the gene from the first gene group is 
selected from genes expressed in normal tissue cells in an amount higher than 
expression in biological condition cells, 

at least one second marker capable of detecting a second expression level of at 
least one gene from a second gene group, wherein the second gene group is 
selected from genes expressed in normal tissue cells in an amount lower than 
expression in biological condition cells. 

In another aspect the invention relates to an assay for determining an expression 
pattern of a colon and/or rectum cell, comprising at least a first marker and a second 
marker, wherein the first marker is capable of detecting a gene from a first gene 
group as defined above, and the second marker is capable of detecting a gene from 
a second gene group as defined above. 

Detailed description of the invention 

Samples 

The samples according to the present invention may be any tissue sample; it is, 
however, often preferred to conduct the methods according to the invention on 
epithelial tissue, such as epithelial tissue from the gastro-intestinal tract, in particular 
from colon and/or rectum. In particular the epithelial tissue may be mucosa. 

The sample may be obtained in any suitable manner known to the person skilled in 
the art, such as a biopsy of the tissue, or a superficial sample scraped from the 
tissue. The sample may be prepared by forming a cell suspension made from the 
tissue, or by obtaining an extract from the tissue. 

In one embodiment it is preferred that the sample comprises substantially only cells 
from said tissue, such as substantially only cells from mucosa of the colon-rectum. 

Biological condition 

The methods according to the invention may be used for determining any biological 
condition, wherein said condition leads to a change in the expression of at least one 
gene, and preferably a change in a variety of genes. 

Thus, the biological condition may be any malignant or premalignant condition, in 
particular in colon/rectum, such as an adenocarcinoma, a carcinoma, a teratoma, a 
sarcoma, and/or a lymphoma. 

In relation to the gastro-intestinal tract, the biological condition may also be 
ulcerative colitis, Crohn's disease (Mb. Crohn), diverticulitis, or adenomas. 

Single gene expression versus expression pattern 

The expression level may be determined by single gene approaches, i.e. wherein 
expression from one, two, or a few genes is determined. It is preferred that 
expression from at least one gene from a first (normal) group is determined, said 
first gene group representing genes being expressed at a higher level in normal 
tissue, i.e. so-called suppressors, in combination with determination of expression 
of at least one gene from a second group, said second group representing genes 
being expressed at a higher level in tissue from the biological condition than in 
normal tissue, i.e. so-called oncogenes. However, determination of the expression 
of a single gene, whether belonging to the first group or the second group, is 
within the scope of the present invention. In this case it is preferred that the single 
gene is selected among genes having a very high change in expression level from 
normal cells to biological condition cells. 

Another approach is determination of an expression pattern from a variety of genes, 
wherein the determination of the biological condition in the tissue relies on 
information from a variety of gene expressions, i.e. on the combination of expressed 
genes rather than on the information from single genes. 

Colorectal tumors 

The data presented herein relate to colorectal tumors, and therefore the 
description has focused on the gene expression level as one way of identifying 
genes that lose function in cancer tissue. Genes showing a remarkable 
downregulation (or complete loss) of the expression level, measured as the mRNA 
transcript, during the malignant progression in colon from normal mucosa through 
Dukes A superficial tumors, to Dukes B slightly invasive tumors, to Dukes C tumors 
that have spread to lymph nodes, and finally to Dukes D tumors that have 
metastasized to other organs, have been examined, as well as genes gaining 
importance during the differentiation towards malignancy. 

Gene groups 

The present invention relates to a variety of genes identified either by an EST 
identification number and/or by a gene identification number. Both types of 
identification numbers relate to identification numbers of the UniGene database, 
NCBI, build 18. 

The various genes have been identified using Affymetrix arrays of the following 
product numbers: 

Human Gene FL array   900183 
HU35K SubA            900184 
HU35K SubB            900185 
HU35K SubC            900186 
HU35K SubD            900187 

First gene group 

The first gene group relates to genes being expressed in normal tissue cells in an 
amount higher than expression in biological condition cells. The term "normal tissue 
cells" relates to cells from the same type of tissue that is examined with respect to 
the biological condition in question. Thus, with respect to colorectal tumors, the 
normal tissue relates to colorectal tissue, in particular to colorectal mucosa. 

The first gene group therefore relates to genes being downregulated in tumors, such 
genes being expected to serve as tumor suppressor genes, and they are of 
importance as predictive markers for the disease as loss of one or more of these may 
signal a poor outcome or an aggressive disease course. Furthermore, they may be 
important targets for therapy as restoring their expression level, e.g. by gene 
therapy, may suppress the malignant growth. 



For a colorectal tissue sample a gene from the first gene group is preferably 
selected individually from genes comprising a sequence as identified below by EST 

UniGene number      Homologous to 
RC_H04768_at        chrom 15 no homology 
RC_Z39652_at        Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23 
RC_H30270_at        chrom 18 PAAAA in colon & bladder no homology 
RC_T47089_s_at      tenascin-X; tenascin-X precursor; unidentified protein 
RC_W31906_at        secretagogin; dJ501N12.8 (putative protein) chrom 6 
RC_AA279803_at      chrom 2 no homology 
RC_R01646_at        chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1 
RC_AA099820_at      BAC clone AC016778 
AA319615_at         secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15 
H07011_at           tetraspan NET-6 mRNA; transmembrane 4 superfamily; chrom 7 
RC_T68873_f_at 
RC_T40995_f_at 
RC_H81070_f_at 
RC_N30796_at 
RC_W37778_f_at 
RC_R70212_s_at 
RC_AA426330_at 
RC_N33927_s_at 
RC_T90190_s_at 
RC_AA447145_at 
RC_H75860_at 
RC_T71132_s_at 
and from genes comprising a sequence as identified below 

Human chromogranin A mRNA, complete cds    J03915 
Human adipsin/complement factor D mRNA, complete cds    M84526 
Homo sapiens MLC-1V/Sb isoform gene    M24248 
Human aminopeptidase N/CD13 mRNA encoding aminopeptidase N, complete cds    M22324 
H.sapiens MT-11 mRNA    X76717 
H.sapiens GCAP-II gene    Z70295 
Human somatostatin I gene and flanks    J00306 
Human YMP mRNA, complete cds    U52101 
H.sapiens mRNA for beta subunit of epithelial amiloride-sensitive sodium channel    X87159 
Human K12 protein precursor mRNA, complete cds    U77643 
Human sulfate transporter (DTD) mRNA, complete cds    U14528 
Human transcription factor hGATA-6 mRNA, complete cds    U66075 
H.sapiens SCAD gene, exon 1 and joining features    Z80345 
Human S-lac lectin L-14-II (LGALS2) gene    M87860 
Human mRNA for protein tyrosine phosphatase    D15049 
H.sapiens mRNA for tetranectin    X64559 
Human 11 kd protein mRNA, complete cds    U28249 
Human anti-mullerian hormone type II receptor precursor gene, complete cds    U29700 
Human heparin binding protein (HBp17) mRNA, complete cds    M60047 
Human ADP-ribosylation factor (hARF6) mRNA, complete cds    M57763 
beta-ADD=adducin beta subunit 63 kda isoform/membrane skeleton protein, beta-ADD=adducin beta subunit 63 kda isoform/membrane skeleton protein {alternatively spliced, exon 10 to 13 region} [human, Genomic, 1851 nt, segment 3 of 3]    S81083 
Zinc Finger Protein Znf155    HG4243-HT4513 
Human glucagon mRNA, complete cds    J04040 
H.sapiens mRNA for hair keratin, hHb5    X99140 
Human tubulin-folding cofactor E mRNA, complete cds    U61232 
Human integrin alpha-3 chain mRNA, complete cds    M59911 
Human NACP gene    U46901 
H.sapiens mRNA for flavin-containing monooxygenase 5 (FMO5)    Z47553 
Human mRNA for ATF-a transcription factor    X52943 
H.sapiens intestinal VIP receptor related protein mRNA    X77777 



and from genes comprising a sequence as identified below 

Homo sapiens chromosome 16 BAC clone CIT987SK-815A9 complete sequence    AF001548 
Human mRNA for ATP synthase alpha subunit, complete cds    D14710 
Human mRNA for IgG Fc binding protein, complete cds    D84239 
H.sapiens mRNA for carcinoembryonic antigen, CGM2    X98311 
Homo sapiens (clone lamda-hPEC-3) phosphoenolpyruvate carboxykinase (PCK1) mRNA, complete cds    L05144 
Human 11-beta-hydroxysteroid dehydrogenase type 2 mRNA, complete cds    U26726 
Human intestinal mucin (MUC2) mRNA, complete cds    L21998 
Human mRNA for KIAA0106 gene, complete cds    D14662 
metallothionein    V00594 



In a preferred embodiment a gene from the first gene group is preferably selected 
individually from genes comprising a sequence as identified below by EST 

UniGene number      Homologous to 
RC_H04768_at        chrom 15 no homology 
RC_Z39652_at        Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23 
RC_H30270_at        chrom 18 PAAAA in colon & bladder no homology 
RC_AA279803_at      chrom 2 no homology 
RC_R01646_at        chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1 
RC_AA099820_at      BAC clone AC016778 



and from genes comprising a sequence as identified below 

Human chromogranin A mRNA, complete cds    J03915 
Human adipsin/complement factor D mRNA, complete cds    M84526 
Homo sapiens MLC-1V/Sb isoform gene    M24248 
Human aminopeptidase N/CD13 mRNA encoding aminopeptidase N, complete cds    M22324 
H.sapiens MT-11 mRNA    X76717 
H.sapiens GCAP-II gene    Z70295 
Human somatostatin I gene and flanks    J00306 

or selected individually from genes comprising a sequence as identified below by 
EST 

UniGene number      Homologous to 
RC_H04768_at        chrom 15 no homology 
RC_Z39652_at        Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23 
RC_H30270_at        chrom 18 PAAAA in colon & bladder no homology 
RC_T47089_s_at      tenascin-X; tenascin-X precursor; unidentified protein 
RC_W31906_at        secretagogin; dJ501N12.8 (putative protein) chrom 6 
RC_AA279803_at      chrom 2 no homology 
RC_R01646_at        chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1 
RC_AA099820_at      BAC clone AC016778 
AA319615_at         secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15 
H07011_at           tetraspan NET-6 mRNA; transmembrane 4 superfamily; chrom 7 



In a more preferred embodiment a gene from the first gene group is selected 
individually from genes comprising a sequence as identified below by EST 

UniGene number      Homologous to 
RC_H04768_at        chrom 15 no homology 
RC_Z39652_at        Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23 
RC_H30270_at        chrom 18 PAAAA in colon & bladder no homology 
RC_T47089_s_at      tenascin-X; tenascin-X precursor; unidentified protein 
RC_W31906_at        secretagogin; dJ501N12.8 (putative protein) chrom 6 
RC_AA279803_at      chrom 2 no homology 
RC_R01646_at        chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1 
AA319615_at         secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15 



In a most preferred embodiment a gene from the first gene group is selected 
individually from genes comprising a sequence as identified below by EST 

UniGene number      Homologous to 
RC_T47089_s_at      tenascin-X; tenascin-X precursor; unidentified protein 
RC_W31906_at        secretagogin; dJ501N12.8 (putative protein) chrom 6 
RC_AA279803_at      chrom 2 no homology 
AA319615_at         secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15 



Second gene group 

We have determined genes that are up-regulated (or gained de novo) during the 
malignant progression of colorectal cancer from normal tissue through Dukes A, B, 
and C to Dukes D. These genes are potential oncogenes and may be those genes that 
create or enhance the malignant growth of the cells. The expression level of these 
genes may serve as a predictive marker for the disease course, as a high level may 
signal an aggressive disease course, and they may serve as targets for therapy, as 
blocking these genes, e.g. by anti-sense therapy or by biochemical means, could 
inhibit or slow the tumor growth. Such up-regulated (or gained de novo) genes, 
i.e. oncogenes, may be classified according to the present invention as genes 
belonging to the second gene group. 



With respect to colorectal tumors, genes belonging to the second gene group are 
preferably selected individually from genes comprising a sequence as identified 
below by EST 

UniGene number      Homologous to 
RC_AA609013_s_at    microsomal dipeptidase (also on 6.8k); chrom 16 
RC_AA232508_at      CGI-89 protein; unnamed protein product; hypothetical protein 
RC_AA428964_at      serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1 
RC_T52813_s_at      dJ28O10.2 (G0S2 (PUTATIVE LYMPHOCYTE G0/G1 SWITCH PROTEIN 2)); chrom 1 
RC_AA075642_at      gp-340 variant protein; DMBT1/8kb.2 protein 
RC_AA007218_at      chrom 13 no homology 
RC_N33920_at        ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6 
RC_N71781_at        KIAA1199 protein, chrom 15 
RC_R67275_s_at      alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1 
RC_W80763_at        hypothetical protein; chrom 17 
RC_AA443793_at      chrom 7p22 AC006028 BAC clone 
RC_AA034499_s_at    ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13 
RC_AA035482_at      chrom 5; AK022505 clone; CalcineurinB (weakly similar) 
RC_AA024482_at      hypothetical protein; unnamed protein product; chrom 17 
RC_H93021_at        chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A) 
RC_AA427737_at      no homology 
RC_AA417078_at      chrom 7q31; AF017104 clone 
M29873_s_at         cytochrome P450-IIB (hIIB3); 19q13.1-q13.2 
RC_H27498_f_at 
RC_T92363_s_at 
RC_N89910_at 
RC_W60516_at 
RC_AA219699_at 
RC_AA449450_at 







and from genes comprising a sequence as identified below 

Homo sapiens (clones MDP4, MDP7) microsomal dipeptidase (MDP) mRNA, complete cds    J05257 
Homo sapiens reg gene homologue, complete cds    L08010 
H.sapiens mRNA for prepro-alpha2(I) collagen    Z74616 
Human S-adenosylhomocysteine hydrolase (AHCY) mRNA, complete cds    M61832 
Transcription Factor IIIa    HG4312-HT4582 
Human gene for melanoma growth stimulatory activity (MGSA)    X54489 
Human stromelysin-3 mRNA    X57766 
CDC25Hu2=cdc25+ homolog [human, mRNA, 3118 nt] 
Human mRNA for cripto protein 
Human transformation-sensitive protein (IEF SSP 3521) mRNA, complete cds 
Human complement component 2 (C2) gene allele b 
H.sapiens mRNA for ITBA2 protein 
H.sapiens encoding CLA-1 mRNA 
Human fibroblast growth factor receptor 4 (FGFR4) mRNA, complete cds 
Fibronectin, Alt. Splice 1 
tyk2 
Human mRNA for B-myb gene 
Human phosphofructokinase (PFKM) mRNA, complete cds 
Human pre-B cell enhancing factor (PBEF) mRNA, complete cds 
Human SH2-containing inositol 5-phosphatase (hSHIP) mRNA, complete cds 
Human interleukin 8 (IL8) gene, complete cds 
Human lamin B receptor (LBR) mRNA, complete cds 
H.sapiens mRNA for protein tyrosine phosphatase 
Human mRNA for unc-18 homologue, complete cds 
H.sapiens mRNA for Zn-alpha2-glycoprotein 
Human asparagine synthetase mRNA, complete cds 
Human hepatitis delta antigen interacting protein A (dipA) mRNA, complete cds 
Human spliceosomal protein (SAP 61) mRNA, complete cds 
Human protein kinase C-binding protein RACK7 mRNA, partial cds 
Human MAC30 mRNA, 3' end 
Human thrombospondin 2 (THBS2) mRNA, complete cds 
Human nicotinamide N-methyltransferase (NNMT) mRNA, complete cds 
H.sapiens mRNA for type I interstitial collagenase 
Human cytochrome b561 gene 
Human H19 RNA gene, complete cds (spliced in silico) 
Human collagen type XVIII alpha 1 (COL18A1) mRNA, partial cds 



S78187 
X14253 
M86752 
L09708 
X92896 
Z22555 
L03840 
HG3044-HT3742 
X54667 
X13293 
U24183 
U02020 
U57650 
M28130 
L25931 
Z48541 
D63851 
X59766 
Z25521 
M27396 
U63825 
U08815 
U48251 
L19183 
L12350 
U08021 
X54925 
U29463 
M32053 
L22548 
U79274 



Human transforming growth factor-beta induced gene product (BIGH3) mRNA, complete cds    M77349 
Human breast epithelial antigen BA46 mRNA, complete cds    U58516 
X57351 
H.sapiens NGAL gene    X99133 
Human mRNA for MDNCF (monocyte-derived neutrophil chemotactic factor)    Y00787 
H.sapiens EF-1 delta gene encoding human elongation factor-1-delta    Z21507 
H.sapiens mRNA for prepro-alpha1(I) collagen    Z74615 
Nuclear Factor Nf-Il6    HG3494-HT3688 
U29175 
HNL=neutrophil lipocalin [human, ovarian cancer cell line OC6, mRNA Partial, 534 nt]    S75256 



In a preferred embodiment the genes belonging to the second gene group are
preferably selected individually from genes comprising a sequence as identified
below by EST



UniGene number       Homologous to

RC_AA007218_at       chrom 13 no homology
RC_AA443793_at       chrom 7p22 AC006028 BAC clone
RC_AA035482_at       chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_H93021_at         chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at       no homology
RC_AA417078_at       chrom 7q31; AF017104 clone



and from genes comprising a sequence as identified below

In another preferred embodiment genes from the second gene group are selected 
individually from genes comprising a sequence as identified below 



UniGene number       Homologous to

RC_AA609013_s_at     microsomal dipeptidase (also on 6.8k); chrom 16
RC_AA232508_at       CGI-89 protein; unnamed protein product; hypothetical protein
RC_AA428964_at       serine protease-like protease; serine protease homolog=NES1;
                     normal epithelial cell-specific 1
RC_T52813_s_at       dJ28O10.2 (G0S2 (PUTATIVE LYMPHOCYTE G0/G1 SWITCH PROTEIN 2;
                     chrom 1
RC_AA075642_at       gp-340 variant protein; DMBT1/8kb.2 protein
RC_AA007218_at       chrom 13 no homology
RC_N33920_at         ubiquitin-like protein FAT10; diubiquitin;
                     dJ271M21.6 (Diubiquitin); chrom 6
RC_N71781_at         KIAA1199 protein, chrom 15
RC_R67275_s_at       alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1;
                     collagen type XI alpha-1 isoform A; chrom 1
RC_W80763_at         hypothetical protein; chrom 17
RC_AA443793_at       chrom 7p22 AC006028 BAC clone
RC_AA034499_s_at     ZNF198 protein; zinc finger protein; FIM protein;
                     Cys-rich protein; zinc finger protein 198; chrom 13
RC_AA035482_at       chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_AA024482_at       hypothetical protein; unnamed protein product; chrom 17
RC_H93021_at         chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at       no homology
RC_AA417078_at       chrom 7q31; AF017104 clone
M29873_s_at          Cytochrome P450-IIB (hIIB3); 19q13.1-q13.2



In a more preferred embodiment genes from the second gene group are selected 
individually from genes comprising a sequence as identified below 



UniGene number       Homologous to

RC_AA609013_s_at     microsomal dipeptidase (also on 6.8k); chrom 16
RC_AA232508_at       CGI-89 protein; unnamed protein product; hypothetical protein
RC_AA428964_at       serine protease-like protease; serine protease homolog=NES1;
                     normal epithelial cell-specific 1
RC_AA075642_at       gp-340 variant protein; DMBT1/8kb.2 protein
RC_AA007218_at       chrom 13 no homology
RC_N33920_at         ubiquitin-like protein FAT10; diubiquitin;
                     dJ271M21.6 (Diubiquitin); chrom 6
RC_N71781_at         KIAA1199 protein, chrom 15
RC_R67275_s_at       alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1;
                     collagen type XI alpha-1 isoform A; chrom 1
RC_W80763_at         hypothetical protein; chrom 17
RC_AA034499_s_at     ZNF198 protein; zinc finger protein; FIM protein;
                     Cys-rich protein; zinc finger protein 198; chrom 13
RC_AA035482_at       chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_AA024482_at       hypothetical protein; unnamed protein product; chrom 17
RC_H93021_at         chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at       no homology
RC_AA417078_at       chrom 7q31; AF017104 clone
M29873_s_at          cytochrome P450-IIB (hIIB3); 19q13.1-q13.2



In an even more preferred embodiment genes from the second gene group are se- 
lected individually from genes comprising a sequence as identified below 



UniGene number       Homologous to

RC_AA609013_s_at     microsomal dipeptidase (also on 6.8k); chrom 16
RC_AA007218_at       chrom 13 no homology
RC_AA035482_at       chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_H93021_at         chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at       no homology
RC_AA417078_at       chrom 7q31; AF017104 clone



such as a sequence as identified below 



UniGene number       Homologous to

RC_W80763_at         hypothetical protein; chrom 17



The genes from the second gene group discussed above are preferably genes be- 
ing expressed in all stages of the biological condition, such as all Dukes stages of a 
colorectal tumor, to be used for determining the biological condition. 



Number of genes 



As discussed above, it is possible to use a single-gene approach, determining the
expression of only one of the genes, in order to determine the biological condition
of the tissue. This is particularly relevant for the genes mentioned in the tables in
Experiments, since each of these genes has been determined to be strongly indicative
on its own. It is however preferred that expression from at least one gene from the
first group as well as expression from one gene from the second group is determined,
to obtain a more statistically significant result that is more independent of the
expression level of the individual gene. In a preferred embodiment expression from
more genes from both groups is determined, such as determination of expression from
at least two genes from either of the gene groups, such as at least three genes from
either of the gene groups, such as at least four genes from either of the gene groups,
such as at least five genes from either of the gene groups, such as at least six genes
from either of the gene groups, such as at least seven genes from either of the gene
groups.

A pattern of characteristic expression of one gene can be useful in characterizing a
cell type source or a stage of disease. However, more genes may be usefully
analyzed. Useful patterns include expression of at least one, two, three, five, ten,
fifteen, twenty, twenty-five, fifty, seventy-five, one hundred or several hundred
informative genes.

Expression level

Using the results provided in the accompanying figures and tables, a gene is
indicated as being expressed if an intensity value of greater than or equal to 20 is
shown. Conversely, an intensity value of less than 20 indicates that the gene is not
expressed above background levels. Comparison of an expression pattern to
another may score a change from expressed to non-expressed, or the reverse.
Alternatively, changes in intensity of expression may be scored, either increases or
decreases. Any statistically significant change can be used. Typically changes which
are greater than 2-fold are suitable. Changes which are greater than 5-fold are
highly significant.
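The scoring rules in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the returned labels are hypothetical, and the sketch applies the fixed intensity and fold-change cut-offs from the text without performing any statistical test.

```python
# Intensity >= 20 counts as "expressed" (cut-off taken from the text).
EXPRESSION_THRESHOLD = 20


def is_expressed(intensity, threshold=EXPRESSION_THRESHOLD):
    """Return True if the intensity is at or above the background cut-off."""
    return intensity >= threshold


def score_change(reference, sample, threshold=EXPRESSION_THRESHOLD):
    """Score one gene when comparing a sample pattern to a reference pattern.

    The returned labels are hypothetical names chosen for this sketch.
    """
    ref_on = is_expressed(reference, threshold)
    sam_on = is_expressed(sample, threshold)
    if ref_on and not sam_on:
        return "expressed -> non-expressed"
    if sam_on and not ref_on:
        return "non-expressed -> expressed"
    if not ref_on and not sam_on:
        return "not expressed in either"
    # Both values are >= threshold here, so the division is safe.
    fold = max(reference, sample) / min(reference, sample)
    direction = "increase" if sample > reference else "decrease"
    if fold > 5:
        return f"{direction}, highly significant (>5-fold)"
    if fold > 2:
        return f"{direction}, suitable (>2-fold)"
    return "no scored change"
```

For example, `score_change(40, 100)` reports a 2.5-fold increase as suitable, while `score_change(100, 10)` reports a change from expressed to non-expressed because 10 is below the cut-off of 20.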

The present invention in particular relates to methods using genes wherein the ratio
of the expression level in normal tissue to biological condition tissue for suppressor
genes, or vice versa of the expression level in biological condition tissue to normal
tissue for condition genes, is as high as possible, such as at least a two-fold change
in expression, such as at least three-fold, such as at least four-fold, such as at least
five-fold, such as at least six-fold, such as at least ten-fold, such as at least
fifteen-fold, such as at least twenty-fold.

WO 01/49879 PCT/DK00/00744

Stages and grades

The stage of a colorectal tumor indicates how deep the tumor has penetrated.
Superficial tumors are termed Dukes A, and Dukes B and Dukes C are used to
describe increasing degrees of penetration into the muscle. The grade of a
colorectal tumor is expressed on a scale of I-IV (1-4). The grade reflects the
cytological appearance of the cells. Grade I cells are almost normal. Grade II cells
are slightly deviant. Grade III cells are clearly abnormal. And Grade IV cells are
highly abnormal.

It is important to classify the stage of a cancer disease, as superficial tumors may
require a less intensive treatment than invasive tumors. We have therefore used the
expression level of genes to identify genes whose expression can be used to identify
a certain stage of the disease. We have divided these "Classifiers" into those
which can be used to identify Dukes A, B, C, and D stages. We expect that measuring
the transcript level of one or more of these genes will lead to a classifier that
can add supplementary information to the information obtained from the pathological
Dukes classification. For example we believe that gene expression levels that signify
a Dukes C will be unfavourable to detect in a Dukes A tumor, as they may signal
that the Dukes A tumor has the potential to become a Dukes C tumor. The opposite
is probably also true, that an expression level that signifies Dukes A will be
favourable to have in a Dukes C tumor. In that way independent information may be
obtained from the pathological Dukes classification and from a classification based
on gene expression levels.
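The idea of combining the two classifications can be illustrated with a minimal Python sketch. It assumes that the stages are ordered A < B < C < D and that the reasoning given for the Dukes A versus Dukes C example extends to any pair of stages; both the function name and the three labels are hypothetical and only restate the expectation expressed in the paragraph above.

```python
# Dukes stages in order of increasing severity (assumed ordering for this sketch).
DUKES_ORDER = ["A", "B", "C", "D"]


def compare_classifications(pathological, expression_based):
    """Compare the pathological Dukes stage with an expression-based stage.

    Returns a hypothetical label: an expression signature of a later stage
    than the pathological stage is treated as unfavourable, one of an
    earlier stage as favourable.
    """
    p = DUKES_ORDER.index(pathological)
    e = DUKES_ORDER.index(expression_based)
    if e > p:
        return "unfavourable"
    if e < p:
        return "favourable"
    return "concordant"
```

Under these assumptions, a pathological Dukes A tumor whose expression levels signify Dukes C gives `compare_classifications("A", "C") == "unfavourable"`.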

Thus, in one embodiment the invention relates to a method as described above further
comprising the steps of determining the stage of a biological condition in the
animal tissue, comprising assaying a third expression level of at least one gene from
a third gene group, wherein a gene from said second gene group, in one stage, is
expressed differently from a gene from said third gene group.
In another aspect the invention relates to a method of determining the stage of a
biological condition in animal tissue,

comprising collecting a sample comprising cells from the tissue,

assaying the expression of at least a first stage gene from a first stage gene
group and/or at least a second stage gene from a second stage gene group,
wherein at least one of said genes is expressed in said first stage of the
condition in a higher amount than in said second stage, and the other gene is
expressed in said first stage of the condition in a lower amount than in said
second stage of the condition,

correlating the expression level of the assessed genes to a standard level of
expression determining the stage of the condition.

The method of determining the stage of a tumor may be combined with determina- 
tion of the biological condition or may be an independent method as such. The dif- 
ference in expression level of a gene from one stage to the expression level of the 
gene in another group is preferably at least two-fold, such as at least three-fold. 
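As a rough illustration of the staging method, the following Python sketch checks the preferred two-fold difference between stages and assigns a sample to the stage whose standard profile it most closely matches. The distance measure (sum of absolute log2 ratios), the function names, and the idea of storing one standard profile per stage are assumptions made for this sketch; the text only requires that expression levels be correlated to a standard level.

```python
import math


def fold_difference(level_a, level_b):
    """Fold difference between two positive expression levels (always >= 1)."""
    return max(level_a, level_b) / min(level_a, level_b)


def is_stage_informative(level_a, level_b, min_fold=2.0):
    """True if a gene differs between two stages by at least min_fold."""
    return fold_difference(level_a, level_b) >= min_fold


def closest_stage(sample, standards):
    """Return the stage whose standard profile is closest to the sample.

    sample:    dict mapping gene name -> expression level (positive numbers)
    standards: dict mapping stage name -> dict of gene name -> standard level
    Closeness is the sum of absolute log2 ratios (an assumed measure).
    """
    def distance(profile):
        return sum(abs(math.log2(sample[g] / profile[g])) for g in profile)

    return min(standards, key=lambda stage: distance(standards[stage]))


# Hypothetical standard levels: one gene higher in the first stage,
# the other gene lower in the first stage, as in the method above.
standards = {
    "first stage": {"gene_up": 200.0, "gene_down": 50.0},
    "second stage": {"gene_up": 60.0, "gene_down": 180.0},
}
sample = {"gene_up": 170.0, "gene_down": 65.0}
assigned = closest_stage(sample, standards)
```

With these made-up numbers the sample lies closer to the first-stage standard, and both genes differ between the two standards by more than two-fold.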


Thus, the invention relates to a method of determining the stage of a colorectal
tumor, wherein the stage is selected from colon cancer stages Dukes A, Dukes B,
Dukes C, and Dukes D, comprising assaying the expression of at least one Dukes A
stage gene from a Dukes A stage gene group, at least one Dukes B stage gene
from a Dukes B stage gene group, at least one Dukes C stage gene
from a Dukes C stage gene group, and/or at least one Dukes D stage gene from a
Dukes D stage gene group, wherein at least one gene from each gene group is
expressed in a significantly different amount in that stage than in one of the other
stages.

The genes selected may be a gene from each gene group being expressed in a 
significantly higher amount in that stage than in one of the other stages, such as: 






a Dukes A stage gene selected individually from any gene comprising a sequence 
as identified below as EST 
RC_AA599199_at       ALU seq.
RC_R12694_at         unnamed protein product BAA91641, chrom 10
RC_H91325_s_at       aldolase B; aldolase B (aa 1-364); chrom 9
RC_N51709_at         chrom X
RC_N72610_at
RC_N69263_at         chrom 10; AK026414 clone (only 108 nt hom)
RC_T15817_f_at       iNOS, inducible nitric oxide synthase



RC_F03077_f          chromosome 17, clone hRPC.15
RC_AA599199          Alu seq
RC_AA207015          clone RP4-733M16 on chromosome 1p36.11-36.23
RC_AA234916          chromosome 19 clone CTC-461H2
RC_N92239_a          Wnt inhibitory factor-1 (WIF-1), chromosome 12
RC_N93958_s          phospholipase A2, group X (PLA2G10),
U95301_at            phospholipase A2, group X (PLA2G10),
RC_AA426330          chromosome 17, clone hRPC.1110_E_20
RC_AA024658          clone SCb-254N2 (UWGC:rg254N02) from 6p21
RC_H88540_a          heat shock protein 90, 1q21.2-q22



or any gene comprising a sequence as identified below 



D87444_at            Human mRNA for KIAA0255 "gene," complete cds
U18291_at            Human CDC16HS "mRNA," complete cds
L76568_xpt3_f_at     S26 from Homo sapiens excision and cross link repair protein
                     (ERCC4) "gene," complete genomic sequence. /gb=L76568
                     /ntype=DNA /annot=exon
U45328_s_at          "Human ubiquitin-conjugating enzyme (UBE2I) ""mRNA,"" complete
                     cds"
Z14982_rna1_at       H. sapiens gene for major histocompatibility complex encoded
                     proteasome subunit LMP7.
AD000092_cds7_s_at   RAD23A gene (human RAD23A homolog) extracted from Homo
                     sapiens DNA from chromosome 19p13.2 cosmids "R31240," R30272
                     and R28549 containing the "EKLF," "GCDH," "CRTC," and RAD23A
                     "genes," genomic sequence
D86973_at            Human mRNA for KIAA0219 "gene," partial cds
X81636_at            H.sapiens clathrin light chain a gene
M59916_at            Human acid sphingomyelinase (ASM) "mRNA," complete cds
X85781_s_at          "H.sapiens NOS2 ""gene,"" exon 27 /gb=X85781 /ntype=DNA
                     /annot=exon"
M57731_s_at          "Human gro-beta ""mRNA,"" complete cds"
U49188_at            Human placenta (Diff33) "mRNA," complete cds
X53800_s_at          Human mRNA for macrophage inflammatory protein-2beta (MIP2beta)
U56816_at            Human kinase Myt1 (Myt1) "mRNA," complete cds.
HG1067-HT1067_r_at   Mucin (Gb:M22406)



numan migration inniDitory idcior-reidtea pruieiM o uvinnoj 

" n o n o 11 rnmnloto pHc 

ytJiifc?, uunipiuit? tub 


M91 ClOR 


Human acyloxyacyl hydrolase "mRNA," complete cds 


M62840 


numan rtr iy (ror^j rnniNA, compieie oub 




1— 1 oininnp Uli imin mDM A 

n. sapiens numig rnriiNM 


XX / £. 1 JO 


UI poniono DICQI DC m QMA 

n. sapiens rlooLnt mrUNA 


A. / OOtt 


n. sapiens mrtNA tot iwisi protein, partial, /go— y i i lou 
/ntype=RNA 


V1 1 1 AH 
Till OVJ 


numan mHNA Tor i vjar-ueta supenamny proiem, com- 
plete cds 


ARnnriRR/i 

MDUUUOO^f 


numan mHNA tor iviooi , compiexe cos 


U I I w+ 


Human complement Tacior d mniNA, complete cos 


I 1 W7C\0 
l_ I D / Kjc. 


|||_|nmn canionc (OTP hinHinn nrntoin /PA RO\ " " m R M A " 11 

nomo Sapiens o i r-uinuing protein ^nMD^j nimN/A, 
complete cos 


M9ft21 ^ 
IVI^O^ I o 


Unmon tronclotinnal initiatirin f a r»tr^ r 9 hota qi i hi I n it / o 1 F_9- 
nUiildil li di loldllUI ldl liilUdlKJM IdUliJi UtJld ouuui ill \cJi i 

hot^A "mRMA " nomnlatp rrlQ 

UtUd/ IIHilN/"\, UUIIIfJiClU ouo 




Human E16 "mRNA," complete cds


M80244 


IEX-1=radiation-inducible immediate-early gene "[human,"
"placenta," mRNA "Partial," 1223 nt]


S81914 


Human CDC16HS "mRNA," complete cds


U18291 


Human DD96 "mRNA," complete cds 


U21049 


Human (mamr\ "mRNA " JTR /nh— mOQQQ /ntvnp— RNA 




"Human ubiquitin-conjugating enzyme (UBE2I) ""mRNA,"" 

complete cds"


U45328 


"Human fetal brain glycogen phosphorylase B ""mRNA,"" 

complete cds"


U47025


"Human BTG2 (BTG2) ""mRNA,"" complete cds"


U72649 


Human jun-B mRNA for JUN-B protein 


X51345 


Human chaperonin 10 "mRNA," complete cds 


U07550 


H.sapiens RING4 cDNA 


X57522 


H.sapiens genes TAP1, TAP2, LMP2, LMP7 and DOB. 


X66401 


H.sapiens mRNA for alpha 4 protein 


Y08915 


Homo sapiens interleukin-1 receptor-associated kinase 
(IRAK) "mRNA," complete cds 


L76191 


"Human von Willebrand factor ""mRNA,"" 3' end" 


M10321


Human chromosome segregation gene homolog CAS 
"mRNA," complete cds 


U33286 


Human Bruton's tyrosine kinase-associated protein-135 
"mRNA," complete cds. 


U77948 


"Human KH type splicing regulatory protein KSRP 
""mRNA,"" complete cds." 


U94832 


H.sapiens ADE2H1 mRNA showing homologies to SAICAR
synthetase and AIR carboxylase of the purine pathway (EC
"6.3.2.6," EC 4.1.1.21)


X53793



a Dukes B stage gene is selected individually from any gene comprising a sequence 
as identified below 



RC_T67463_s_at                cathepsin O2; X; K
RC_W94688_at                  perilipin
RC_AA126743_at                Z97200 PAC chrom 1q24; PMX1 homeobox gene
RC_AA236547_at                no homology
RC_AA255567_at                angiopoietin-related protein-2; angiopoietin-like 2
RC_AA421256_at
RC_AA386386_s_at     PPPPP
RC_AA452549_at       PPPPP    PRO1659; hypothetical protein chrom 11



M63262_at 

R67290_at 
N36619_at 
L19161_at 

RC_AA496035 
L29217_s_at 
RC_W73194_a 
RC_N69507_a 

RC_H15814_s 
M84526_at 



5-lipoxygenase activating protein (FLAP), 
13q12 

Interleukine 14 

translation initiation factor 2, subunit 3", 
Xp22.2-22.1 

Chromosome 1? (TIGR) 
CDC-like kinase 3 (CLK3), 15q24 
Dermatoponin, 1q12-q23 
hypothetical protein PR01847 (Alu accor- 
ding to TIGR) 

adipose most abundant gene transcript 1 
D component of complement (adipsin) 



or any gene comprising a sequence as identified below 



U57316_at            Human GCN5 (hGCN5) "gene," complete cds
X66839_at            H.sapiens MaTu MINI mRNA for p54/58N protein
J04599_at            Human hPGI mRNA encoding bone small proteoglycan I "(biglycan),"
                     complete cds
X57579_s_at          H.sapiens activin beta-A subunit (exon 2)
J02874_at            Human adipocyte lipid-binding "protein," complete cds
M11749_at            Human Thy-1 glycoprotein "gene," complete cds
U06863_at            Human follistatin-related protein precursor "mRNA," complete cds
U51010_s_at          "Human nicotinamide N-methyltransferase ""gene,"" exon 1 and 5'
                     flanking region. /gb=U51010 /ntype=DNA /annot=exon"
U08021_at            "Human nicotinamide N-methyltransferase (NNMT) ""mRNA,""
                     complete cds"
HG3044-HT3742_s_at   """Fibronectin,"" Alt. Splice 1"
X02761_s_at          Human mRNA for fibronectin (FN precursor)
X02544_at            Human mRNA for alpha1-acid glycoprotein (orosomucoid)
M62505_at            Human C5a anaphylatoxin receptor "mRNA," complete cds
J05070_at            Human type IV collagenase "mRNA," complete cds
U16306_at            Human chondroitin sulfate proteoglycan versican V0 splice-variant
                     precursor peptide "mRNA," complete cds
M14218_at            Human argininosuccinate lyase "mRNA," complete cds
L77567_s_at          "Homo sapiens mitochondrial citrate transport protein (CTP)
                     ""mRNA,"" 3' end"
M63391_rna1_at       Human desmin gene, complete cds.
D13643_at            Human mRNA for KIAA0018 "gene," complete cds
D79985_at            Human mRNA for KIAA0163



Human adipocyte lipid-binding "protein," complete cds 


J02874 


Human A1 protein "mRNA," complete cds 


U29680 


Human LGN protein "mRNA," complete cds 


U54999 


Human skeletal muscle LIM-protein SLIM2 "mRNA," partial 
cds 


U60116 


Human mRNA for alpha1-acid glycoprotein (orosomucoid)


X02544 


Human mRNA for fibronectin receptor alpha subunit 


X06256 


H.sapiens P1-Cdc21 mRNA


X74794 


H. sapiens mRNA for fibulin-2 


X82494 


H. sapiens 5T4 gene for 5T4 Oncofetal antigen 


Z29083 


Homo sapiens mRNA for osteoblast specific factor 2 (OSF- 
2os) 


D13666


Mac25 


HG987-HT987 


"Human lysozyme ""mRNA,"" complete cds with an Alu 
repeat in the 3' flank"


J03801 


Human metalloproteinase (HME) "mRNA," complete cds 


L23808 


Human alpha-1 collagen type IV gene, exon 52. 


M26576 


Human lumican "mRNA," complete cds 


U21128 


Human mRNA for fibronectin (FN precursor) 


X02761 


Human mRNA fragment for elongation factor TU (N- 
terminus). /gb=X03689 /ntype=RNA 


X03689 


Human mRNA for type IV collagen alpha -2 chain 


X05610 


Human mRNA for collagen VI alpha-1 C-terminal globular 
domain 


X15880 


"H.sapiens," gene for Membrane cofactor protein 


X59405 


H. sapiens SOD-2 gene for manganese superoxide dismu- 
tase. /gb=X65965 /ntype=DNA /annot=exon 


X65965


H.sapiens NMB mRNA 


X76534 


H.sapiens vimentin gene 


Z19554 


Human chaperonin 10 "mRNA," complete cds 


U07550 


H.sapiens RING4 cDNA 


X57522 


H.sapiens genes TAP1, TAP2, LMP2, LMP7 and DOB. 


X66401 


H.sapiens mRNA for alpha 4 protein 


Y08915 


Homo sapiens interleukin-1 receptor-associated kinase 
(IRAK) "mRNA," complete cds 


L76191 


"Human von Willebrand factor ""mRNA,"" 3' end" 


M10321


Human chromosome segregation gene homolog CAS 
"mRNA," complete cds 


U33286 
Human Bruton's tyrosine kinase-associated protein-135
"mRNA," complete cds. 


U77948 


"Human KH type splicing regulatory protein KSRP 
""mRNA,"" complete cds." 


U94832 


H.sapiens ADE2H1 mRNA showing homologies to SAICAR 
synthetase and AIR carboxylase of the purine pathway (EC 
"6.3.2.6," EC 4.1.1.21) 


X53793 


Globin,"" Beta" 


HG1428-
HT1428


"Human alpha-1 collagen type I ""gene,"" 3' end"


M55998 


H.sapiens mRNA for SOX-4 protein 


X70683 


"Human mRNA for collagen binding protein ""2,"" complete 
cds" 


D83174 


Human SPARC/osteonectin "mRNA," complete cds 


J03040 


Human PRAD1 mRNA for cyclin 


X59798 



a Dukes C stage gene is selected individually from any gene comprising a sequence 
as identified below 



RC_D45556_at                  chrom 15; AL390085 clone
RC_W86214_at
RC_AA039439_s_at              novel gene KIAA0134 protein 19q13.3
RC_AA128935_at
RC_AA134158_s_at              class I homeodomain; homeobox protein, chrom 7
RC_AA232646_at                chrom 17, AF266756 sphingosine kinase (SPHK1
RC_AA401184_at                no homology
RC_AA436840_at
RC_AA488655_at
RC_AA181902_at       PPPPP    AC007201 on chrom 19 (only 80nt hom)


RC_AA1 22350 
AA374109_at 

RC_AA621755 
RC_AA442069 
RC_T40767_a 
RC_AA488655 
RC_AA398908 
RC_AA447764 

RC_N69136_a 



chromosome 8 

spondin 2, extracellular matrix 
protein, chromosome 4 
transcription factor Dp-2, 3q23 
sodium channel 2, 12q12 
chromosome 19 
Mus? 

hypothetical protein, chromosome 
4 



or any gene comprising a sequence as identified below 



M20681_at            Human glucose transporter-like protein-III "(GLUT3)," complete cds
D50914_at            Human mRNA for KIAA0124 "gene," partial cds
L37362_at            Homo sapiens (clone d2-115) kappa opioid receptor (OPRK1)
                     "mRNA," complete cds
X66114_rna1_at       H.sapiens gene for 2-oxoglutarate carrier protein.
M32053_at            Human H19 RNA "gene," complete cds (spliced in silico)
Y00787_s_at          Human mRNA for MDNCF (monocyte-derived neutrophil chemotactic
                     factor)
U64444_at            Human ubiquitin fusion-degradation protein (UFD1L) "mRNA,"
                     complete cds
X95325_s_at          H.sapiens mRNA for DNA binding protein A variant
X02419_rna1_s_at     H.sapiens uPA gene
X57522_at            H.sapiens RING4 cDNA
AB001325_at          Human AQP3 gene for aquaporine 3 (water "channel)," partial cds
AB002315_at          Human mRNA for KIAA0317 "gene," complete cds. /gb=AB002315
                     /ntype=RNA
L12760_s_at          "Human phosphoenolpyruvate carboxykinase (PCK1) ""gene,""
                     complete cds with repeats"
M80899_at



Ribosomal Protein L39 Homolog 


HG2874- 
HT3018 


Homo sapiens (clone d2-115) kappa opioid receptor
(OPRK1) "mRNA," complete cds


L37362 


Human kell blood group protein mRNA 


M64934 




U73167 


Human cancellous bone osteoblast mRNA for serin pro- 
tease with IGF-binding "motif," complete cds 


D87258 


Human interferon-inducible protein 27-Sep "mRNA," com- 
plete cds 


J04164 


"Human sickle cell beta-globin ""mRNA,"" complete cds" 


M25079 




M29277 


"Human spermidine synthase ""mRNA,"" complete cds" 


M34338 


Human copine I "mRNA," complete cds 


U83246 


Globin,"" Beta" 


HG1428- 
HT1428 


"Human alpha-1 collagen type I ""gene,"" 3' end" 


M55998 


H.sapiens mRNA for SOX-4 protein 


X70683 


"Human mRNA for collagen binding protein ""2,"" complete 
cds" 


D83174 


Human SPARC/osteonectin "mRNA," complete cds


J03040 


Human PRAD1 mRNA for cyclin 


X59798 



a Dukes D stage gene is selected individually from any gene comprising a sequence 
as identified below 



RC_N91920_at         AAAAP    chrom 16p12-p11.2; XM_007994 retinoblastoma
                              binding protein
RC_AA621601_at       AAAAP    chrom 17 XM_009868 RAB36 RAS oncogene family



RC_AA121433          Axin, chromosome 16
RC_N91920_a          RB protein binding protein, chromosome 16
RC_AA621601          GTP-binding protein Rab36, chromosome 17
RC_AA454020          NADPH quinone oxidoreductase homolog; p53 induced,
                     chromosome 2
RC_Z39652_a          APM-1 gene, chromosome 18



or any gene comprising a sequence as identified below 



X17644_s_at          Human GST1-Hs mRNA for GTP-binding protein
Y12812_at            H.sapiens RFXAP mRNA
X60486_at            H.sapiens H4/g gene for H4 histone
X52221_at            H.sapiens ERCC2 "gene," exons 1 & 2 (partial)
L06175_at            Homo Sapiens P5-1 "mRNA," complete cds
Z48481_at            H.sapiens mRNA for membrane-type matrix metalloproteinase 1
X54232_at            Human mRNA for heparan sulfate proteoglycan (glypican)
L08010_at            "Homo sapiens reg gene ""homologue,"" complete cds"
L27706_at            Human chaperonin protein (Tcp20) gene complete cds
L15533_rna1_at       Homo sapiens pancreatitis-associated protein (PAP) gene,
                     complete cds.
X51408_at            Human mRNA for n-chimaerin
K02765_at            Human complement component C3 "mRNA," alpha and beta
                     "subunits," complete cds
U38904_at            Human zinc finger protein C2H2-25 "mRNA," complete cds



Homo sapiens FRG1 "mRNA," complete cds 


L76159 


Human cyclin protein "gene," complete cds 


M15796 


Human U2 small nuclear RNA-associated B" antigen 
"mRNA," complete cds 


M15841 


Human mRNA export protein Rae1 (RAE1) "mRNA," com- 
plete cds. 


U84720 


Human protease-activated receptor 3 (PAR3) "mRNA," 
complete cds. 


U92971 


H.sapiens mRNA for mediator of receptor-induced toxicity 


X84709 


H.sapiens RFXAP mRNA 


Y12812 


Human mRNA for "Qip1 ," complete cds 


AB002533 


Human mRNA for transferrin receptor 


X01060 


"metastasis-associated gene ""[human,"" highly metastatic 
lung cell subline Anip[937], mRNA ""Partial,"" 978 nt]"


S79219 



The genes selected may be a gene from each gene group being expressed in a 
significantly lower amount in that stage than in one of the other stages, such as: 
a Dukes A stage gene is selected individually from any gene comprising a sequence
as identified below



RC_N32411_f_at       PAPPP    Myc-associated zinc-finger protein of human islet;
                              chrom 16
RC_AA243858_at       PAPPP    KIAA0882 protein
RC_AA486283_at       PAPPP    ras-like protein; ras-related C3 botulinum toxin
                              substrate; dJ20J23
RC_AA490930_at       PAPPP    chrom 18; KIAA1468 protein
RC_H54088_s_at       PPPPP    ribosomal protein L41
RC_H59052_f_at       PPPPP    fungal sterol-C5-desaturase homolog; ORF;
                              thymosin beta-4
RC_R49198_s_at       PPPPP
RC_T73572_f_at       PPPPP    ferritin L-chain; L apoferritin
RC_AA477483_at       PPPPP    no matching est



or any gene comprising a sequence as identified below



Homo sapiens SKB1 Hs "mRNA," complete cds.
/gb=AF015913 /ntype=RNA                                        AF015913
Mucin (Gb:M22406)                                              HG1067-HT1067
Human platelet activating factor "acetylhydrolase," brain
"isoform," 45 kDa subunit (LIS1) gene                          U72342
Homosapiens ERK activator kinase (MEK2) mRNA                   L11285
Human 20-kDa myosin light chain (MLC-2) "mRNA,"
complete cds                                                   J02854
H.sapiens lysosomal acid phosphatase gene (EC 3.1.3.2)
Exon 1 (and joined CDS).                                       X15525
Human mRNA for matrix Gla protein                              X53331
H.sapiens mRNA for diacylglycerol kinase                       X62535
Human heat shock protein (hsp 70) gene, complete cds.          M11717
Human TRPM-2 protein gene                                      M63379



a Dukes B stage gene is selected individually from any gene comprising a sequence
as identified below
RC_D59847_at         PPAPP    proSAAS; granin-like neuroendocrine peptide precursor
RC_F05038_at         PPAPP    polyamine modulated factor-1; polyamine modulated
                              factor 1
RC_N41059_at         PPAPP    chrom 3
RC_T23460_at         PPAPP    chrom 3; IFNAR2 21q22.11
RC_W42789_at         PPAPP    chrom 8 AF268037 C8ORF4 protein (C8ORF4) chrom 8 ORF
RC_AA460017_i_at     PPAPP    BAC clone chrom 16
RC_AA482127_at       PPAPP    KIAA1142 protein
RC_AA504806_at       PPAPP    chrom 2 AF052107 clone 23620 mRNA sequence
RC_T90037_at         PPPPP    unnamed protein product, chrom 4
RC_AA432130_at       PPPPP    KIAA0867 protein, chrom 12



or any gene comprising a sequence as identified below 



Human gene for mitochondrial acetoacetyl-CoA thiolase          D10511
Human mRNA for transcription factor "AREB6," complete cds      D15050
Human mRNA for KIAA0248 "gene," partial cds                    D87435
Homo sapiens (clone CC6) NADH-ubiquinone oxidoreductase
subunit "mRNA," 3' end cds                                     L04490
Human phosphoglucomutase 1 (PGM1) "mRNA," complete cds         M83088
Homo sapiens guanylin "mRNA," complete cds                     M97496
"Human trans-Golgi p230 ""mRNA,"" complete cds"                U41740
H. sapiens mRNA for vacuolar proton "ATPase," subunit D        X71490
H.sapiens mRNA for 3-hydroxy-3-methylglutaryl coenzyme A
synthase                                                       X83618
Human mRNA for KIAA0018 "gene," complete cds                   D13643
"Mucin ""1,"" ""Epithelial,"" Alt. Splice 9"                   HG371-HT26388
H.sapiens mRNA for L-3-hydroxyacyl-CoA dehydrogenase           X96752



a Dukes C stage gene is selected individually from any gene comprising a sequence 
as identified below 



RC_N30231_at         PPPAP    Lsm4 protein; U6 snRNA-associated Sm-like protein
                              LSm4; glycine-rich protein
RC_W73790_f_at       PPPAP    immunoglobulin-related protein 14.1; lambda L-chain C
                              region; omega protein, chrom 22
RC_AA412184_at       PPPAP    chrom 1p36; d89060 dolichyl-diphosphooligosaccharide-
                              protein glycosyltransferase
RC_AA521303_at       PPPAP    methionine adenosyltransferase regulatory beta subunit;
                              dTDP-4-keto-6-deoxy-D-glucose 4-reductase, chrom 5
RC_AA461174_at       PPPPP    8p21.3-p22 AB020860 anti-oncogene
AA393432_s_at        PPPPP    chrom 2, Unknown; unnamed protein product AAD20029



or any gene comprising a sequence as identified below 



Homo sapiens colon mucosa-associated (DRA) "mRNA,"
complete cds                                                   L02785
Human Ig J chain gene                                          M12759
Human selenium-binding protein (hSBP) "mRNA,"
complete cds, /gb=U29091 /ntype=RNA                            U29091
H.sapiens mRNA for sigma 3B protein                            X99459
Human ERK1 mRNA for protein serine/threonine kinase            X60188
Human mRNA for mitochondrial 3-oxoacyl-CoA "thiolase,"
complete cds                                                   D16294
"Biliary ""Glycoprotein,"" Alt. Splice ""5,"" A"               HG2850-HT4814
Human AQP3 gene for aquaporine 3 (water "channel),"
partial cds                                                    AB001325
Human CD14 mRNA for myeloid cell-specific leucine-rich
glycoprotein                                                   X13334
Human thioredoxin "mRNA," nuclear gene encoding
mitochondrial "protein," complete cds                          U78678
Human mitochondrial ATPase coupling factor 6 subunit
(ATP5A) "mRNA," complete cds                                   M37104
"Human MHC class II HLA-DP light chain ""mRNA,""
complete cds"                                                  M57466
Human mRNA for early growth response protein 1 (hEGR1)         X52541
Human mRNA for mitochondrial 3-ketoacyl-CoA thiolase
beta-subunit of trifunctional "protein," complete cds          D16481
Homo sapiens laminin-related protein (LamA3) "mRNA,"
complete cds                                                   L34155
H.sapiens mRNA for selenoprotein P                             Z11793
Human hkf-1 "mRNA," complete cds                               D76444
Homo sapiens nuclear domain 10 protein (ndp52) "mRNA,"
complete cds                                                   U22897
Human X104 "mRNA," complete cds                                L27476
H. sapiens cDNA for RFG                                        X77548
H. sapiens mRNA for Progression Associated Protein             Y07909
Human liver 2,4-dienoyl-CoA reductase "mRNA," complete cds     U49352
Human A33 antigen precursor "mRNA," complete cds               U79725
H.sapiens pS2 protein gene                                     X52003
Human RASF-A PLA2 "mRNA," complete cds                         M22430
Homo sapiens pstl mRNA for pancreatic secretory inhibitor
(expressed in neoplastic tissue).                              Y00705
Human CO-029                                                   M35252



a Dukes D stage gene is selected individually from any gene comprising a sequence 
as identified below 




RC_R72886_s_at    PPPP  A  KIAA0422; adenylyl cyclase type VI, chrom 12

RC_AA026030_at    PPPP  A  chrom 1

RC_Z39006_at      PPPP  A  hypothetical protein, chrom 17

RC_AA435908_at    PPPP  A  chrom 19; ac011491 clone and 20 nt hom. RAB2, RAS oncogene family

RC_AA057829_s_at  PPPP  A  growth-arrest-specific protein; growth arrest-specific 6; AXL stimulatory factor, chrom 13

RC_R72087_at      PPPP  A  chrom 5 EST; hom to chrom 20 AL356652 clone

RC_H04242_at      PPPP  A  ras related protein Rab5b; RAB5B, member RAS oncogene family

RC_R97304_f_at    PPPP  A  HLA-drb5; cell surface glycoprotein; MHC HLA-DR-beta chain precursor chrom 6

RC_N48609_at      PPPP  A  chrom 11; AC004584 chrom 17

RC_W86850_f_at    PPPP  A  chrom 22 ? X96924 mitochondrial citrate transport region

RC_AA130603_at    PPPP  A  ak024908 clone

RC_AA479610_at    PPPP  A  singleton ak025344 clone

RC_AA490593_i_at  PPPP  A  chrom 17 ? Synaptobrevin2 (VAMP2) AF135372

RC_AA054321_s_at  PPPP  A  6p21 HLA class I region; AC004202 clone

RC_D60328_at      PPPP  P  chrom 6, unknown; ring finger protein 5

RC_H96850_at      PPPP  P  oligosaccharyltransferase d89060 1p36.1 (also C-class)

RC_AA127444_at    PPPP  P  chrom 1 no homology

RC_AA242824_at    PPPP  P  chrom 11; ac005233 PAC clone chrom 22

AA405775_s_at     PPPP  P  similar to CAA16821 (PID:g3255952)



or any gene comprising a sequence as identified below 

Human complement component C3 mRNA, alpha and beta subunits, complete cds    K02765

H. sapiens mRNA for adenosine triphosphatase, calcium    Z69881

Human skeletal muscle LIM-protein SLIM1 mRNA, complete cds    U60115

Human platelet-derived growth factor receptor alpha (PDGFRA) mRNA, complete cds    M21574

Human mRNA for KIAA0247 gene, complete cds    D87434

Human mRNA for KIAA0171 gene, complete cds    D79993

Human Down syndrome critical region protein (DSCR1) mRNA, complete cds    U28833

Human Ki nuclear autoantigen mRNA, complete cds    U11292



Expression patterns

The objects of the invention are achieved by providing one or more of the
embodiments described below. In one embodiment a method is provided of
determining an expression pattern of a cell sample preferably independent of the
proportion of submucosal, muscle and connective tissue cells present. Expression is
determined of one or more genes in a sample comprising cells, said genes being
selected from the same genes as discussed above and shown in the tables of the
Examples.

It is an object of the present invention that characteristic patterns of expression of
genes can be used to characterize different types of tissue. Thus, for example, gene
expression patterns can be used to characterize stages and grades of colorectal tumors.
Similarly, gene expression patterns can be used to distinguish cells having a
colorectal origin from other cells. Moreover, gene expression of cells which routinely
contaminate colorectal tumor biopsies has been identified, and such gene
expression can be removed or subtracted from patterns obtained from colorectal
biopsies. Further, the gene expression patterns of single-cell solutions of colorectal
tumor cells have been found to be far freer of interfering expression of
contaminating muscle, submucosal, and connective tissue cells than biopsy
samples.

The one or more genes exclude genes which are expressed in the submucosal,
muscle, and connective tissue. A pattern of expression is formed for the sample
which is independent of the proportion of submucosal, muscle, and connective
tissue cells in the sample.

In another aspect of the invention a method of determining an expression pattern of
a cell sample is provided. Expression is determined of one or more genes in a
sample comprising cells. A first pattern of expression is thereby formed for the
sample. Genes which are expressed in submucosal, muscle, and connective tissue
cells are removed from the first pattern of expression, forming a second pattern of
expression which is independent of the proportion of submucosal, muscle, and
connective tissue cells in the sample.
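
The removal step can be illustrated with a minimal sketch. It assumes that an expression pattern is held as a mapping from a gene identifier to a measured intensity; the gene identifiers, the intensities and the helper name `remove_wall_genes` are hypothetical and are not taken from the tables.

```python
def remove_wall_genes(first_pattern, wall_genes):
    """Form a second pattern by dropping genes expressed in
    submucosal, muscle, and connective tissue cells."""
    return {gene: level
            for gene, level in first_pattern.items()
            if gene not in wall_genes}


# Hypothetical first pattern (gene identifier -> intensity).
first_pattern = {"gene_a": 120.0, "gene_b": 35.0, "gene_c": 410.0}
# Hypothetical set of genes known to be expressed in the colon wall.
wall_genes = {"gene_b"}

second_pattern = remove_wall_genes(first_pattern, wall_genes)
print(second_pattern)
```

Because the removed genes no longer contribute, the second pattern does not change when the share of colon wall cells in the sample changes.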

Another embodiment of the invention provides a method for determining an
expression pattern of a colorectal mucosa or colorectal cancer cell. Expression is
determined of one or more genes in a sample comprising colorectal mucosa or
colorectal cancer cells; the expression determined forms a first pattern of
expression. A second pattern of expression which was formed using the one or
more genes and a sample comprising predominantly submucosal, muscle, and
connective tissue cells, is subtracted from the first pattern of expression, forming a
third pattern of expression. The third pattern of expression reflects expression of the
colorectal mucosa or colorectal cancer cells independent of the proportion of
submucosal, muscle, and connective tissue cells present in the sample.
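
A minimal sketch of the subtraction follows. It assumes that both patterns are mappings from gene identifier to intensity measured on a comparable scale, and that negative differences are floored at zero; the flooring, the function name `subtract_pattern` and the example values are assumptions made for illustration only.

```python
def subtract_pattern(first_pattern, wall_pattern, floor=0.0):
    """Subtract a colon wall (second) pattern from a sample (first)
    pattern, gene by gene, giving a third pattern."""
    third_pattern = {}
    for gene, level in first_pattern.items():
        # Genes absent from the wall pattern contribute no background.
        background = wall_pattern.get(gene, 0.0)
        # Flooring at zero is an assumption, not part of the description.
        third_pattern[gene] = max(level - background, floor)
    return third_pattern


first_pattern = {"gene_a": 100.0, "gene_b": 20.0}   # hypothetical biopsy
wall_pattern = {"gene_a": 30.0, "gene_b": 50.0}     # hypothetical colon wall
print(subtract_pattern(first_pattern, wall_pattern))
```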


Diagnosing 

In another embodiment of the invention a method is provided of detecting an
invasive tumor in a patient. A marker is detected in a sample of a body fluid. The
body fluid is selected from the group consisting of blood, plasma, serum, faeces,
mucus, sputum, cerebrospinal fluid and/or urine. The marker is an mRNA or protein
expression product of a gene which is more prevalent in submucosal, muscle, and
connective tissue than in the body fluid. An increased amount of the marker in the
body fluid indicates a tumor which has become invasive in the patient.

In another aspect of the invention a method is provided for diagnosing a colorectal
cancer. A first pattern of expression is determined of one or more genes in a colonic
tissue sample suspected of being neoplastic. The first pattern of expression is
compared to a second and third reference pattern of expression. The second pattern
is of the one or more genes in normal colorectal mucosa and the third pattern is of
the one or more genes in colorectal cancer. A first pattern of expression which is
found to be more similar to the third pattern than the second indicates neoplasia of
the colorectal tissue sample.
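
The comparison can be sketched as follows. No similarity measure is prescribed here, so the sketch assumes Pearson correlation over intensity vectors that list the same genes in the same order; the function names and all values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length intensity vectors.
    Assumes neither vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def indicates_neoplasia(first, normal_ref, cancer_ref):
    """True if the first pattern is more similar to the cancer (third)
    reference pattern than to the normal mucosa (second) pattern."""
    return pearson(first, cancer_ref) > pearson(first, normal_ref)


normal_ref = [10, 200, 50, 5]     # hypothetical normal mucosa pattern
cancer_ref = [150, 20, 60, 90]    # hypothetical colorectal cancer pattern
sample = [140, 30, 55, 80]        # hypothetical suspect sample
print(indicates_neoplasia(sample, normal_ref, cancer_ref))
```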

According to yet another aspect of the invention a method is provided for predicting
outcome or prescribing treatment of a colorectal tumor. A first pattern of expression
is determined of one or more genes in a colorectal tumor sample. The first pattern is
compared to one or more reference patterns of expression determined for colorectal
tumors at a grade between I and IV. The reference pattern which shares maximum
similarity with the first pattern is identified. The outcome or treatment appropriate for
the grade of tumor of the reference pattern with the maximum similarity is assigned
to the colorectal tumor sample.

In another embodiment of the invention a method is provided for determining the grade
of a colorectal tumor. A first pattern of expression is determined of one or more
genes in a colorectal tumor sample. The first pattern is compared to one or more
reference patterns of expression determined for colorectal tumors at a grade
between I and IV. The grade of the reference pattern with the maximum similarity is
assigned to the colorectal tumor sample.

Yet another embodiment of the invention provides a method to determine the stage of a
colorectal tumor as described above. A first pattern of expression is determined of
one or more genes in a colorectal tumor sample. The first pattern is compared to
one or more reference patterns of expression determined for colorectal tumors at
different stages. The reference pattern which shares maximum similarity with the
first pattern is identified. The stage of the reference pattern with the maximum
similarity is assigned to the colorectal tumor sample.
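
Assignment by maximum similarity can be sketched as a nearest-reference lookup. The sketch assumes that similarity is measured as the smallest Euclidean distance between intensity vectors listing the same genes in the same order; the distance measure, the function name `assign_stage` and the reference values are assumptions for illustration.

```python
from math import dist


def assign_stage(first_pattern, reference_patterns):
    """Return the stage label whose reference pattern is closest
    (smallest Euclidean distance) to the first pattern."""
    return min(reference_patterns,
               key=lambda stage: dist(first_pattern, reference_patterns[stage]))


# Hypothetical reference patterns for three genes, one per Dukes stage.
reference_patterns = {
    "Dukes A": [100, 10, 5],
    "Dukes B": [80, 40, 10],
    "Dukes C": [40, 80, 30],
    "Dukes D": [10, 120, 90],
}
tumor_sample = [45, 75, 35]  # hypothetical tumor sample
print(assign_stage(tumor_sample, reference_patterns))
```

The same lookup applies to grades I to IV if the reference patterns are keyed by grade instead of stage.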

In still another embodiment of the invention a method is provided for identifying a
tissue sample as colorectal. A first pattern of expression is determined of one or
more genes in a tissue sample. The first pattern is compared to a second pattern of
expression obtained for normal mucosa cells. Similarity between the first
and the second patterns suggests that the tissue sample is mucosa in its origin. This
method is particularly useful when diagnosing a metastasis possibly distant from
its origin.

Another aspect of the invention is a method to aid in diagnosing, predicting
outcome, or prescribing treatment of a colorectal cancer. A first pattern of
expression is determined of one or more genes in a first colorectal tissue sample. A
second pattern of expression is determined of the one or more genes in a second
colorectal tissue sample. The first colorectal tissue sample is a normal colorectal
mucosa sample or an earlier stage or lower grade of colorectal tumor than the
second colorectal tissue sample. The first pattern of expression is compared to the
second pattern of expression to identify a first set of genes which are increased in
the second colorectal tissue sample relative to the first colorectal tissue sample and
a second set of genes which are decreased in the second colorectal tissue sample
relative to the first colorectal tissue sample. Those genes which are expressed in
submucosal, muscle or connective tissue are removed from the first set of genes.
Those genes which are not expressed in submucosal, muscle, or connective tissue
are removed from the second set of genes.
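
The two filtering steps can be sketched with set operations. The sketch assumes positive intensities and a two-fold change as the criterion for "increased" and "decreased"; the threshold, the function name `changed_gene_sets` and the example values are assumptions and are not specified above.

```python
def changed_gene_sets(first, second, wall_expressed, fold=2.0):
    """Return (increased, decreased) gene sets for the second sample
    relative to the first, filtered by colon wall expression."""
    increased = {g for g in first if second[g] >= fold * first[g]}
    decreased = {g for g in first if first[g] >= fold * second[g]}
    # Genes expressed in the colon wall are removed from the first set.
    increased -= wall_expressed
    # Genes NOT expressed in the colon wall are removed from the second set.
    decreased &= wall_expressed
    return increased, decreased


first = {"g1": 10, "g2": 100, "g3": 50, "g4": 80, "g5": 20}    # hypothetical
second = {"g1": 40, "g2": 20, "g3": 55, "g4": 30, "g5": 90}    # hypothetical
wall_expressed = {"g4", "g5"}                                   # hypothetical
print(changed_gene_sets(first, second, wall_expressed))
```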

Independence of submucosal, muscle and connective tissue 

Since a biopsy often contains tissue other than the tissue to be examined, such as
connective tissue when the tissue to be examined is epithelial or mucosal, the
invention also relates to methods wherein the expression pattern of the tissue is
independent of the amount of connective tissue in the sample.

Biopsies contain epithelial cells that most often are the targets for the studies, and in
addition many other cells that contaminate the epithelial cell fraction to a varying
extent. The contaminants include histiocytes, endothelial cells, leukocytes, nerve cells,
muscle cells etc. Microdissection is the method of choice for DNA examination, but in
the case of expression studies this procedure is difficult due to RNA degradation during
the procedure. The epithelium may be gently removed and the expression in the remaining
submucosa and underlying connective tissue (the colon wall) monitored. Genes
expressed at high or low levels in the colon wall should be interrogated when
performing expression monitoring of the mucosa and tumors. A similar approach could
be used for studies of epithelia in other organs.

Normal mucosa lining the colon lumen, from colons resected for colon cancer, was scraped
off. Then biopsies were taken from the denuded submucosa and connective tissue,
reaching approximately 5 mm into the colon wall, and immediately disintegrated in
guanidinium isothiocyanate. Total RNA may be extracted, pooled, and poly(A)+ mRNA
may be prepared from the pool followed by conversion to double-stranded cDNA and
in vitro transcription into cRNA containing biotin-labeled CTP and UTP.

Genes that are expressed and genes that are not expressed in colon wall can both
interfere with the interpretation of the expression in a biopsy, and should be
interrogated when interpreting expression intensities in tumor biopsies, as the colon
wall component of a biopsy varies in amount from biopsy to biopsy.

Once the pattern of genes expressed in colon wall components has been determined,
said pattern may be subtracted from a pattern obtained from the sample, resulting in a
third pattern related to the mucosa (epithelial) cells.

In another aspect of the invention a method is provided for determining an
expression pattern of a colorectal tissue sample independent of the proportion of
submucosal, muscle and connective tissue cells present. A single-cell suspension of
disaggregated colorectal tumor cells is isolated from a colorectal tissue sample
comprising colorectal cells, submucosal cells, muscle cells, and connective tissue
cells. A pattern of expression is thus formed for the sample which is independent of
the proportion of submucosal, muscle, and connective tissue cells in the colorectal
tissue sample.

Yet another method relates to elimination of mRNA from colon wall components before
determining the pattern, e.g. by filtration and/or affinity chromatography to remove
mRNA related to the colon wall.

Detection 

Working with human tumor material requires biopsies, and working with RNA
requires freshly frozen or immediately processed biopsies. Apart from the cancer
tissue, biopsies inevitably contain many different cell types, such as cells present
in the blood, connective and muscle tissue, endothelium etc. In the case of DNA
studies, microdissection or laser capture are the methods of choice; however, the
time-dependent degradation of RNA makes it difficult to perform manipulation of the
tissue for more than a few minutes. Furthermore, studies of expressed sequences
may be difficult on the few cells obtained via microdissection or laser capture, as
these may have an expression pattern that deviates from the predominant pattern in
a tumor due to large intratumoral heterogeneity.

In the present context high density expression arrays may be used to evaluate the
impact of colorectal wall components in colorectal tumor biopsies, and to test the
preparation of single-cell solutions as a means of eliminating the contaminants. The
results of these evaluations permit us to design methods of evaluating colorectal
samples without the interfering background noise caused by ubiquitous
contaminating submucosal, muscle, and connective tissue cells. The evaluating
assays of the invention may be of any type.

While high density expression arrays can be used, other techniques are also
contemplated. These include other techniques for assaying for specific mRNA
species, including RT-PCR and Northern blotting, as well as techniques for
assaying for particular protein products, such as ELISA, Western blotting, and
enzyme assays. Gene expression patterns according to the present invention are
determined by measuring any gene product of a particular gene, including mRNA
and protein. A pattern may be for one or more genes.

RNA or protein can be isolated and assayed from a test sample using any
techniques known in the art. They can for example be isolated from a fresh or frozen
biopsy, from formalin-fixed tissue, or from body fluids, such as blood, plasma, serum,
urine, or sputum.

The expression data provided for submucosal, muscle, and connective tissue can
be used in at least three ways to improve the quality of data for a tested sample.
The genes identified in the data as expressed can be excluded from the testing or
from the analysis. Alternatively, the intensity of expression of the genes expressed
in the submucosal, muscle, and connective tissue can be subtracted from the
intensity of expression determined for the test tissue.

The data collected and disclosed here as "connective tissue" is presumed to contain
both muscle and submucosal gene expression as well. Thus it represents the
composite expression of these cell types which can typically contaminate a
colorectal biopsy.

Detection of expression 


Expression of genes may in general be detected by either detecting mRNA from the 
cells and/or detecting expression products, such as peptides and proteins. 

mRNA detection 


The detection of mRNA may be a tool for determining the developmental stage of a
cell type, which may be definable by its pattern of expression of
messenger RNA. For example, in particular stages of cells, high levels of ribosomal
RNA are found whereas relatively low levels of other types of messenger RNAs may
be found. Where a pattern is shown to be characteristic of a stage, a stage may be
defined by that particular pattern of messenger RNA expression. The mRNA
population is a good determinant of developmental stage and will be correlated with
other structural features of the cell. In this manner, cells at specific developmental
stages will be characterized by the intracellular environment, as well as the
extracellular environment. The present invention also allows the combination of
definitions based, in part, upon antigens and, in part, upon mRNA expression.
In one embodiment, the two may be combined in a single incubation step. A
particular incubation condition may be found which is compatible with both
hybridization recognition and non-hybridization recognition molecules. Thus, e.g., an
incubation condition may be selected which allows both specificity of antibody
binding and specificity of nucleic acid hybridization. This allows simultaneous
performance of both types of interactions on a single matrix. Again, where
developmental mRNA patterns are correlated with structural features, or with probes
which are able to hybridize to intracellular mRNA populations, a cell sorter may be
used to sort specifically those cells having desired mRNA population patterns.

It is within the general scope of the present invention to provide methods for the
detection of mRNA. Such methods often involve sample extraction, PCR
amplification, nucleic acid fragmentation and labeling, extension reactions,
transcription reactions and the like.

Sample preparation 

The nucleic acid (either genomic DNA or mRNA) may be isolated from the sample
according to any of a number of methods well known to those of skill in the art. One
of skill will appreciate that where alterations in the copy number of a gene are to be
detected genomic DNA is preferably isolated. Conversely, where expression levels
of a gene or genes are to be detected, preferably RNA (mRNA) is isolated.

Methods of isolating total mRNA are well known to those of skill in the art. In one
embodiment, the total nucleic acid is isolated from a given sample using, for
example, an acid guanidinium-phenol-chloroform extraction method and poly(A)+
mRNA is isolated by oligo dT column chromatography or by using (dT)n magnetic
beads (see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual (2nd
ed.), Vols. 1-3, Cold Spring Harbor Laboratory, (1989), or Current Protocols in
Molecular Biology, F. Ausubel et al., ed. Greene Publishing and Wiley-Interscience,
New York (1987)).



The sample may be from tissue and/or body fluids, as defined elsewhere herein.
Before analyzing the sample, e.g., on an oligonucleotide array, it will often be



WO 01/49879 PCT/DK00/00744 


desirable to perform one or more sample preparation operations upon the sample.
Typically, these sample preparation operations will include such manipulations as
extraction of intracellular material, e.g., nucleic acids from whole cell samples,
viruses and the like, amplification of nucleic acids, fragmentation, transcription,
labeling and/or extension reactions. One or more of these various operations may
be readily incorporated into the device of the present invention.

DNA Extraction 

DNA extraction may be relevant in case possible mutations in the genes are to be
determined in addition to the determination of expression of the genes.

For those embodiments where whole cells, or other tissue samples are being
analyzed, it will typically be necessary to extract the nucleic acids from the cells or
viruses, prior to continuing with the various sample preparation operations.
Accordingly, following sample collection, nucleic acids may be liberated from the
collected cells, viral coat, etc., into a crude extract, followed by additional treatments
to prepare the sample for subsequent operations, e.g., denaturation of
contaminating (DNA binding) proteins, purification, filtration, desalting, and the like.

Liberation of nucleic acids from the sample cells, and denaturation of DNA binding
proteins may generally be performed by physical or chemical methods. For
example, chemical methods generally employ lysing agents to disrupt the cells and
extract the nucleic acids from the cells, followed by treatment of the extract with
chaotropic salts such as guanidinium isothiocyanate or urea to denature any
contaminating and potentially interfering proteins.

Alternatively, physical methods may be used to extract the nucleic acids and
denature DNA binding proteins, such as physical protrusions within microchannels
or sharp-edged particles which pierce cell membranes and extract their contents.
Combinations of such structures with piezoelectric elements for agitation can
provide suitable shear forces for lysis.

More traditional methods of cell extraction may also be used, e.g., employing a
channel with restricted cross-sectional dimension which causes cell lysis when the
sample is passed through the channel with sufficient flow pressure. Alternatively,
cell extraction and denaturing of contaminating proteins may be carried out by
applying an alternating electrical current to the sample. More specifically, the sample
of cells is flowed through a microtubular array while an alternating electric current is
applied across the fluid flow. Subjecting cells to ultrasonic agitation, or forcing cells
through microgeometry apertures, thereby subjecting the cells to high shear stress
resulting in rupture, are also possible extraction methods.

Filtration 


Following extraction, it will often be desirable to separate the nucleic acids from
other elements of the crude extract, e.g., denatured proteins, cell membrane
particles, salts, and the like. Removal of particulate matter is generally accomplished
by filtration, flocculation or the like. Further, where chemical denaturing methods are
used, it may be desirable to desalt the sample prior to proceeding to the next step.
Desalting of the sample, and isolation of the nucleic acid may generally be carried
out in a single step, e.g., by binding the nucleic acids to a solid phase and washing
away the contaminating salts or performing gel filtration chromatography on the
sample, passing salts through dialysis membranes, and the like. Suitable solid
supports for nucleic acid binding include, e.g., diatomaceous earth, silica (i.e., glass
wool), or the like. Suitable gel exclusion media, also well known in the art, may also
be readily incorporated into the devices of the present invention, and are
commercially available from, e.g., Pharmacia and Sigma Chemical.

Alternatively, desalting methods may generally take advantage of the high
electrophoretic mobility and negative charge of DNA compared to other elements.
Electrophoretic methods may also be utilized in the purification of nucleic acids from
other cell contaminants and debris. Upon application of an appropriate electric field,
the nucleic acids present in the sample will migrate toward the positive electrode
and become trapped on a capture membrane. Sample impurities remaining free of
the membrane are then washed away by applying an appropriate fluid flow. Upon
reversal of the voltage, the nucleic acids are released from the membrane in a
substantially purer form. Further, coarse filters may also be overlaid on the barriers
to avoid any fouling of the barriers by particulate matter, proteins or nucleic acids,
thereby permitting repeated use.

Separation of contaminants by chromatography 

In a similar aspect, the high electrophoretic mobility of nucleic acids with their
negative charges may be utilized to separate nucleic acids from contaminants by
utilizing a short column of a gel or other appropriate matrix which will slow or
retard the flow of other contaminants while allowing the faster nucleic acids to pass.

This invention provides nucleic acid affinity matrices that bear a large number of
different nucleic acid affinity ligands allowing the simultaneous selection and
removal of a large number of preselected nucleic acids from the sample. Methods of
producing such affinity matrices are also provided. In general the methods involve
the steps of a) providing a nucleic acid amplification template array comprising a
surface to which are attached at least 50 oligonucleotides having different nucleic
acid sequences, and wherein each different oligonucleotide is localized in a
predetermined region of said surface, the density of said oligonucleotides is greater
than about 60 different oligonucleotides per 1 cm², and all of said different
oligonucleotides have an identical terminal 3' nucleic acid sequence and an identical
terminal 5' nucleic acid sequence, b) amplifying said multiplicity of oligonucleotides
to provide a pool of amplified nucleic acids; and c) attaching the pool of nucleic
acids to a solid support.
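
The numerical requirements on the template array in step a) can be expressed as a simple check. This is a minimal sketch under stated assumptions: oligonucleotides are plain strings, the shared terminal sequences have a known length, and "greater than about 60" is treated as strictly greater than 60 per cm². The function name, the flank sequences and the localization-free model are hypothetical.

```python
from itertools import product


def is_valid_template_array(oligos, flank_len, area_cm2,
                            min_distinct=50, min_density=60.0):
    """Check the count, density, and shared-terminus conditions."""
    distinct = set(oligos)
    if len(distinct) < min_distinct:
        return False
    if len(distinct) / area_cm2 <= min_density:
        return False
    # All oligonucleotides must share identical 5' and 3' terminal sequences.
    five_prime = {o[:flank_len] for o in distinct}
    three_prime = {o[-flank_len:] for o in distinct}
    return len(five_prime) == 1 and len(three_prime) == 1


# Hypothetical flanks around every 3-base core: 64 distinct oligonucleotides.
FIVE_PRIME, THREE_PRIME = "ACGTACGT", "TTGGCCAA"
oligos = [FIVE_PRIME + "".join(core) + THREE_PRIME
          for core in product("ACGT", repeat=3)]
print(is_valid_template_array(oligos, flank_len=8, area_cm2=1.0))
```

The identical terminal sequences are what allow a single primer pair to amplify the whole pool in step b).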

For example, nucleic acid affinity chromatography is based on the tendency of
complementary, single-stranded nucleic acids to form a double-stranded or duplex
structure through complementary base pairing. A nucleic acid (either DNA or RNA)
can easily be attached to a solid substrate (matrix) where it acts as an immobilized
ligand that interacts with and forms duplexes with complementary nucleic acids
present in a solution contacted to the immobilized ligand. Unbound components can
be washed away from the bound complex to either provide a solution lacking the
target molecules bound to the affinity column, or to provide the isolated target
molecules themselves. The nucleic acids captured in a hybrid duplex can be
separated and released from the affinity matrix by denaturation either through heat,
adjustment of salt concentration, or the use of a destabilizing agent such as
formamide, TWEEN™-20 denaturing agent, or sodium dodecyl sulfate (SDS).

Affinity columns (matrices) are typically used to isolate a single nucleic acid,
typically by providing a single species of affinity ligand. Alternatively, affinity columns
bearing a single affinity ligand (e.g. oligo dT columns) have been used to isolate a
multiplicity of nucleic acids where the nucleic acids all share a common sequence
(e.g. a polyA).
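
Complementary base pairing between an immobilized ligand and a target can be sketched as a string operation. The sketch assumes DNA alphabets only and treats duplex formation as the presence of a perfect reverse complement of the ligand within the target; real hybridization tolerates mismatches and depends on salt and temperature, so this is a deliberate simplification with hypothetical sequences.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def forms_duplex(ligand, target):
    """True if the target contains a perfect complement of the ligand."""
    return reverse_complement(ligand) in target.upper()


oligo_dt = "TTTTTTTT"                 # ligand on an oligo dT column
polyadenylated = "GGCATCAAAAAAAA"     # hypothetical target with a polyA tail
no_tail = "GGCATCGG"                  # hypothetical target without one

print(forms_duplex(oligo_dt, polyadenylated), forms_duplex(oligo_dt, no_tail))
```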

Affinity matrices 

The type of affinity matrix used depends on the purpose of the analysis. For
example, where it is desired to analyze mRNA expression levels of particular genes
in a complex nucleic acid sample (e.g., total mRNA) it is often desirable to eliminate
nucleic acids produced by genes that are constitutively overexpressed and thereby
tend to mask gene products expressed at characteristically lower levels. Thus, in
one embodiment, the affinity matrix can be used to remove a number of preselected
gene products (e.g., actin, GAPDH, etc.). This is accomplished by providing an
affinity matrix bearing nucleic acid affinity ligands complementary to the gene
products (e.g., mRNAs or nucleic acids derived therefrom) or to subsequences
thereof. Hybridization of the nucleic acid sample to the affinity matrix will result in
duplex formation between the affinity ligands and their target nucleic acids. Upon
elution of the sample from the affinity matrix, the matrix will retain the duplexed
nucleic acids, leaving a sample depleted of the overexpressed target nucleic acids.

The affinity matrix can also be used to identify unknown mRNAs or cDNAs in a
sample. Where the affinity matrix contains nucleic acids complementary to every
known gene (e.g., in a cDNA library, DNA reverse transcribed from an mRNA,
mRNA used directly or amplified, or polymerized from a DNA template) in a sample,
capture of the known nucleic acids by the affinity matrix leaves a sample enriched
for those nucleic acid sequences that are unknown. In effect, the affinity matrix is
used to perform a subtractive hybridization to isolate unknown nucleic acid
sequences. The remaining "unknown" sequences can then be purified and
sequenced according to standard methods.
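
The logic of the subtraction can be sketched as a partition of the sample. The sketch assumes that each nucleic acid is represented by an identifier and that capture is all-or-nothing for identifiers present on the matrix; the function name and identifiers are hypothetical, and partial hybridization is ignored.

```python
def subtractive_hybridization(sample, known_on_matrix):
    """Split a sample into nucleic acids retained by the matrix (known)
    and those left in the flow-through (enriched for unknown)."""
    retained = [s for s in sample if s in known_on_matrix]
    flow_through = [s for s in sample if s not in known_on_matrix]
    return retained, flow_through


sample = ["actin", "GAPDH", "novel_1", "novel_2"]   # hypothetical sample
known_on_matrix = {"actin", "GAPDH", "tubulin"}     # hypothetical ligand set
retained, unknown = subtractive_hybridization(sample, known_on_matrix)
print(unknown)
```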

The affinity matrix can also be used to capture (isolate) and thereby purify unknown
nucleic acid sequences. For example, an affinity matrix can be prepared that
contains nucleic acids (affinity ligands) that are complementary to sequences not
previously identified, or not previously known to be expressed in a particular nucleic
acid sample. The sample is then hybridized to the affinity matrix and those
sequences that are retained on the affinity matrix are "unknown" nucleic acids. The
retained nucleic acids can be eluted from the matrix (e.g. at increased temperature,
increased destabilizing agent concentration, or decreased salt) and the nucleic acids
can then be sequenced according to standard methods.

Similarly, the affinity matrix can be used to efficiently capture (isolate) a number of
known nucleic acid sequences. Again, the matrix is prepared bearing nucleic acids
complementary to those nucleic acids it is desired to isolate. The sample is
contacted to the matrix under conditions where the complementary nucleic acid
sequences hybridize to the affinity ligands in the matrix. The non-hybridized material
is washed off the matrix leaving the desired sequences bound. The hybrid duplexes
are then denatured providing a pool of the isolated nucleic acids. The different
nucleic acids in the pool can be subsequently separated according to standard
methods (e.g. gel electrophoresis).

As indicated above the affinity matrices can be used to selectively remove nucleic 
20 acids from virtually any sample containing nucleic acids (e.g., in a cDNA library, 
DNA reverse transcribed from an mRNA, mRNA used directly or amplified, or 
polymerized from a DNA template, and so forth). The nucleic acids adhering to the 
column can be removed by washing with a low salt concentration buffer, a buffer 
containing a destabilizing agent such as formamide, or by elevating the column 
temperature. 

In one particularly preferred embodiment, the affinity matrix can be used in a method 
to enrich a sample for unknown RNA sequences (e.g. expressed sequence tags 
(ESTs)). The method involves first providing an affinity matrix bearing a library of 
oligonucleotide probes specific to known RNA (e.g., EST) sequences. Then, RNA 
from undifferentiated and/or unactivated cells and RNA from differentiated, 
activated, or pathological (e.g., transformed) cells, or cells otherwise having a 
different metabolic state, are separately hybridized against the affinity matrices 
to provide two pools of RNAs lacking the known RNA sequences. 

In a preferred embodiment, the affinity matrix is packed into a columnar casing. The 
sample is then applied to the affinity matrix (e.g. injected onto a column or applied to 
a column by a pump such as a sampling pump driven by an autosampler). The 
affinity matrix (e.g. affinity column) bearing the sample is subjected to conditions 
under which the nucleic acid probes comprising the affinity matrix hybridize 
specifically with complementary target nucleic acids. Such conditions are 
accomplished by maintaining appropriate pH, salt and temperature conditions to 
facilitate hybridization as discussed above. 

For a number of applications, it may be desirable to extract and separate messenger 
RNA from cells, cellular debris, and other contaminants. As such, the device of the 
present invention may, in some cases, include an mRNA purification chamber or 
channel. In general, such purification takes advantage of the poly-A tails on mRNA. 
In particular and as noted above, poly-T oligonucleotides may be immobilized 
within a chamber or channel of the device to serve as affinity ligands for mRNA. 
Poly-T oligonucleotides may be immobilized upon a solid support incorporated 
within the chamber or channel, or alternatively, may be immobilized upon the 
surface(s) of the chamber or channel itself. Immobilization of oligonucleotides on the 
surface of the chambers or channels may be carried out by methods described 
herein including, e.g., oxidation and silanation of the surface followed by standard 
DMT synthesis of the oligonucleotides. 

In operation, the lysed sample is introduced to a high salt solution to increase the 
ionic strength for hybridization, whereupon the mRNA will hybridize to the 
immobilized poly-T. The mRNA bound to the immobilized poly-T oligonucleotides is 
then washed free in a low ionic strength buffer. The poly-T oligonucleotides may be 
immobilized upon porous surfaces, e.g., porous silicon, zeolites, silica xerogels, 
sintered particles, or other solid supports. 

Hybridization 

Following sample preparation, the sample can be subjected to one or more different 
analysis operations. A variety of analysis operations may generally be performed, 
including size based analysis using, e.g., microcapillary electrophoresis, and/or 
sequence based analysis using, e.g., hybridization to an oligonucleotide array. 

In the latter case, the nucleic acid sample may be probed using an array of 
oligonucleotide probes. Oligonucleotide arrays generally include a substrate having 
a large number of positionally distinct oligonucleotide probes attached to the 
substrate. These arrays may be produced using mechanical or light directed 
synthesis methods which incorporate a combination of photolithographic methods 
and solid phase oligonucleotide synthesis methods. 

Light directed synthesis of oligonucleotide arrays 

The basic strategy for light directed synthesis of oligonucleotide arrays is as follows. 
The surface of a solid support, modified with photosensitive protecting groups is 
illuminated through a photolithographic mask, yielding reactive hydroxyl groups in 
the illuminated regions. A selected nucleotide, typically in the form of a 
3'-O-phosphoramidite-activated deoxynucleoside (protected at the 5' hydroxyl with a 
photosensitive protecting group), is then presented to the surface and coupling 
occurs at the sites that were exposed to light. Following capping and oxidation, the 
substrate is rinsed and the surface is illuminated through a second mask, to expose 
additional hydroxyl groups for coupling. A second selected nucleotide (e.g., 
5'-protected, 3'-O-phosphoramidite-activated deoxynucleoside) is presented to the 
surface. The selective deprotection and coupling cycles are repeated until the 
desired set of products is obtained. Since photolithography is used, the process can 
be readily miniaturized to generate high density arrays of oligonucleotide probes. 
Furthermore, the sequence of the oligonucleotides at each site is known. See 
Pease, et al. Mechanical synthesis methods are similar to the light directed methods 
except involving mechanical direction of fluids for deprotection and addition in the 
synthesis steps. 

For some embodiments, oligonucleotide arrays may be prepared having all possible 
probes of a given length. The hybridization pattern of the target sequence on the 
array may be used to reconstruct the target DNA sequence. Hybridization analysis 
of large numbers of probes can be used to sequence long stretches of DNA or 
provide an oligonucleotide array which is specific and complementary to a particular 
nucleic acid sequence. For example, in particularly preferred aspects, the 
oligonucleotide array will contain oligonucleotide probes which are complementary 
to specific target sequences, and individual or multiple mutations of these. Such 
arrays are particularly useful in the diagnosis of specific disorders which are 
characterized by the presence of a particular nucleic acid sequence. 
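
By way of illustration only, the notion of an array bearing all possible probes of a given length, and of the hybridization pattern a target would produce on it, can be sketched as follows. The sketch assumes idealized, perfect-match hybridization with no cross-hybridization; the function names are hypothetical and are not taken from this specification.

```python
from itertools import product

# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def all_probes(length):
    """Return every DNA oligonucleotide of the given length (4**length probes)."""
    return ["".join(bases) for bases in product("ACGT", repeat=length)]


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridizing_probes(target, length):
    """Return the set of probes perfectly complementary to some window of the target.

    Idealized assumption: a probe gives signal only if it is the exact
    reverse complement of a contiguous stretch of the target.
    """
    windows = {target[i:i + length] for i in range(len(target) - length + 1)}
    return {reverse_complement(window) for window in windows}
```

For a target of length n and probes of length k, at most n - k + 1 of the 4**k probes are expected to give signal under these assumptions; that set of positive probes is the hybridization pattern from which a sequence would be reconstructed.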

Following sample collection and nucleic acid extraction, the nucleic acid portion of 
the sample is typically subjected to one or more preparative reactions. These 
preparative reactions include in vitro transcription, labeling, fragmentation, 
amplification and other reactions. Nucleic acid amplification increases the number of 
copies of the target nucleic acid sequence of interest. A variety of amplification 
methods are suitable for use in the methods and device of the present invention, 
including for example, the polymerase chain reaction (PCR), the ligase 
chain reaction (LCR), self sustained sequence replication (3SR), and nucleic acid 
sequence based amplification (NASBA). 

The latter two amplification methods involve isothermal reactions based on 
isothermal transcription, which produce both single stranded RNA (ssRNA) and 
double stranded DNA (dsDNA) as the amplification products in a ratio of 
approximately 30 or 100 to 1, respectively. As a result, where these latter methods 
are employed, sequence analysis may be carried out using either type of substrate, 
i.e., complementary to either DNA or RNA. 

Frequently, it is desirable to amplify the nucleic acid sample prior to hybridization. 
One of skill in the art will appreciate that whatever amplification method is used, if a 
quantitative result is desired, care must be taken to use a method that maintains or 
controls for the relative frequencies of the amplified nucleic acids. 

PCR 

Methods of "quantitative" amplification are well known to those of skill in the art. For 
example, quantitative PCR involves simultaneously co-amplifying a known quantity 
of a control sequence using the same primers. This provides an internal standard 
that may be used to calibrate the PCR reaction. The high density array may then 
include probes specific to the internal standard for quantification of the amplified 
nucleic acid. 

Thus, in one embodiment, this invention provides for a method of optimizing a probe 
set for detection of a particular gene. Generally, this method involves providing a 
high density array containing a multiplicity of probes of one or more particular 
length(s) that are complementary to subsequences of the mRNA transcribed by the 
target gene. In one embodiment the high density array may contain every probe of a 
particular length that is complementary to a particular mRNA. The probes of the high 
density array are then hybridized with their target nucleic acid alone and then 
hybridized with a high complexity, high concentration nucleic acid sample that does 
not contain the targets complementary to the probes. Thus, for example, where the 
target nucleic acid is an RNA, the probes are first hybridized with their target nucleic 
acid alone and then hybridized with RNA made from a cDNA library (e.g., reverse 
transcribed polyA+ mRNA) where the sense of the hybridized RNA is opposite 
that of the target nucleic acid (to ensure that the high complexity sample does not 
contain targets for the probes). Those probes that show a strong hybridization signal 
with their target and little or no cross-hybridization with the high complexity sample 
are preferred probes for use in the high density arrays of this invention. 
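
The selection criterion described above can be sketched as a simple filter. This is a minimal illustration only: the threshold parameters `min_signal` and `max_cross_ratio`, the dictionary layout, and the function name are assumptions introduced for the example and are not values or names given in this specification.

```python
def select_probes(target_signal, background_signal, min_signal, max_cross_ratio):
    """Return probes with strong target signal and little cross-hybridization.

    target_signal:     dict mapping probe id -> intensity when hybridized
                       with the target nucleic acid alone.
    background_signal: dict mapping probe id -> intensity when hybridized
                       with the high complexity sample lacking the target.
    min_signal:        assumed minimum intensity counted as a strong signal.
    max_cross_ratio:   assumed maximum allowed ratio of cross-hybridization
                       signal to target signal.
    """
    selected = []
    for probe, signal in target_signal.items():
        # A probe missing from the background measurement is treated as
        # showing no cross-hybridization.
        cross = background_signal.get(probe, 0.0)
        if signal >= min_signal and cross <= max_cross_ratio * signal:
            selected.append(probe)
    return sorted(selected)
```

In practice the two thresholds would be chosen empirically for the array and labeling chemistry in use.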

PCR amplification generally involves the use of one strand of the target nucleic acid 
sequence as a template for producing a large number of complements to that 
sequence. Generally, two primer sequences complementary to different ends of a 
segment of the complementary strands of the target sequence hybridize with their 
respective strands of the target sequence, and in the presence of polymerase 
enzymes and nucleoside triphosphates, the primers are extended along the target 
sequence. The extensions are melted from the target sequence and the process is 
repeated, this time with the additional copies of the target sequence synthesized in 
the preceding steps. PCR amplification typically involves repeated cycles of 
denaturation, hybridization and extension reactions to produce sufficient amounts of 
the target nucleic acid. The first step of each cycle of the PCR involves the 
separation of the nucleic acid duplex formed by the primer extension. Once the 
strands are separated, the next step in PCR involves hybridizing the separated 
strands with primers that flank the target sequence. The primers are then extended 
to form complementary copies of the target strands. For successful PCR 
amplification, the primers are designed so that the position at which each primer 
hybridizes along a duplex sequence is such that an extension product synthesized 
from one primer, when separated from the template (complement), serves as a 
template for the extension of the other primer. The cycle of denaturation, 
hybridization, and extension is repeated as many times as necessary to obtain the 
desired amount of amplified nucleic acid. 
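
As a rough guide to how many cycles are needed, the amplification can be modelled as exponential growth, with the copy number multiplied by (1 + E) in each cycle, where E is the per-cycle efficiency (E = 1 corresponds to perfect doubling). The sketch below assumes this idealized model, which ignores the plateau observed in late cycles; the function names are hypothetical.

```python
def copies_after_cycles(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a given number of PCR cycles.

    Assumes every cycle multiplies the copy number by (1 + efficiency).
    """
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_needed(initial_copies, desired_copies, efficiency=1.0):
    """Smallest number of cycles giving at least desired_copies in this model."""
    if efficiency <= 0:
        raise ValueError("efficiency must be positive")
    cycles = 0
    copies = float(initial_copies)
    # Repeat the denaturation/hybridization/extension cycle until enough
    # product has accumulated.
    while copies < desired_copies:
        copies *= 1.0 + efficiency
        cycles += 1
    return cycles
```

Under perfect doubling, for example, a single template molecule requires 20 cycles to exceed one million copies.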

In PCR methods, strand separation is normally achieved by heating the reaction to a 
sufficiently high temperature for a sufficient time to cause the denaturation of the 
duplex but not to cause an irreversible denaturation of the polymerase. Typical heat 
denaturation involves temperatures ranging from about 80°C to 105°C 
for times ranging from seconds to minutes. Strand separation, however, can be 
accomplished by any suitable denaturing method including physical, chemical, or 
enzymatic means. Strand separation may be induced by a helicase, for example, or 
an enzyme capable of exhibiting helicase activity. 

In addition to PCR and IVT reactions, the methods and devices of the present 
invention are also applicable to a number of other reaction types, e.g., reverse 
transcription, nick translation, and the like. 

Labelling before hybridization 

The nucleic acids in a sample will generally be labeled to facilitate detection in 
subsequent steps. Labeling may be carried out during the amplification, in vitro 
transcription or nick translation processes. In particular, amplification, in vitro 
transcription or nick translation may incorporate a label into the amplified or 
transcribed sequence, either through the use of labeled primers or the incorporation 
of labeled dNTPs into the amplified sequence. 

Hybridization between the sample nucleic acid and the oligonucleotide probes upon 
the array is then detected, using, e.g., epifluorescence confocal microscopy. 
Typically, the sample is mixed during hybridization to enhance hybridization of nucleic 
acids in the sample to nucleic acid probes on the array. 

Labelling after hybridization 



In some cases, hybridized oligonucleotides may be labeled following hybridization. 
For example, where biotin labeled dNTPs are used in, e.g., amplification or 
transcription, streptavidin linked reporter groups may be used to label hybridized 
complexes. Such operations are readily integratable into the systems of the present 
invention. Alternatively, the nucleic acids in the sample may be labeled following 
amplification. Post amplification labeling typically involves the covalent attachment 
of a particular detectable group upon the amplified sequences. Suitable labels or 
detectable groups include a variety of fluorescent or radioactive labeling groups well 
known in the art. These labels may also be coupled to the sequences using 
methods that are well known in the art. 

Methods for detection depend upon the label selected. A fluorescent label is 
preferred because of its extreme sensitivity and simplicity. Standard labeling 
procedures are used to determine the positions where interactions between a 
sequence and a reagent take place. For example, if a target sequence is labeled 
and exposed to a matrix of different probes, only those locations where probes do 
interact with the target will exhibit any signal. Alternatively, other methods may be 
used to scan the matrix to determine where interaction takes place. Of course, the 
spectrum of interactions may be determined in a temporal manner by repeated 
scans of interactions which occur at each of a multiplicity of conditions. However, 
instead of testing each individual interaction separately, a multiplicity of sequence 
interactions may be simultaneously determined on a matrix. 

Means of detecting labeled target (sample) nucleic acids hybridized to the probes of 
the high density array are known to those of skill in the art. Thus, for example, where 
a colorimetric label is used, simple visualization of the label is sufficient. Where a 
radioactive labeled probe is used, detection of the radiation (e.g., with photographic 
film or a solid state detector) is sufficient. 

In a preferred embodiment, however, the target nucleic acids are labeled with a 
fluorescent label and the localization of the label on the probe array is accomplished 
with fluorescent microscopy. The hybridized array is excited with a light source at 
the excitation wavelength of the particular fluorescent label and the resulting 
fluorescence at the emission wavelength is detected. In a particularly preferred 
embodiment, the excitation light source is a laser appropriate for the excitation of the 
fluorescent label. 



The target polynucleotide may be labeled by any of a number of convenient 
detectable markers. A fluorescent label is preferred because it provides a very 
strong signal with low background. It is also optically detectable at high resolution 
and sensitivity through a quick scanning procedure. Other potential labeling moieties 
include radioisotopes, chemiluminescent compounds, labeled binding proteins, 
heavy metal atoms, spectroscopic markers, magnetic labels, and linked enzymes. 
Another method for labeling may bypass any label of the target sequence. The 
target may be exposed to the probes, and a double strand hybrid is formed at those 
positions only. Addition of a double strand specific reagent will detect where 
hybridization takes place. An intercalative dye such as ethidium bromide may be 
used as long as the probes themselves do not fold back on themselves to a 
significant extent forming hairpin loops. However, the length of the hairpin loops in 
short oligonucleotide probes would typically be insufficient to form a stable duplex. 

Suitable chromogens will include molecules and compounds which absorb light in a 
distinctive range of wavelengths so that a color may be observed, or emit light when 
irradiated with radiation of a particular wavelength or wavelength range, e.g., 
fluorescers. Biliproteins, e.g., phycoerythrin, may also serve as labels. 

A wide variety of suitable dyes are available, being primarily chosen to provide an 
intense color with minimal absorption by their surroundings. Illustrative dye types 
include quinoline dyes, triarylmethane dyes, acridine dyes, alizarine dyes, 
phthaleins, insect dyes, azo dyes, anthraquinoid dyes, cyanine dyes, 
phenazathionium dyes, and phenazoxonium dyes. 

A wide variety of fluorescers may be employed either by themselves or in 
conjunction with quencher molecules. Fluorescers of interest fall into a variety of 
categories having certain primary functionalities. These primary functionalities 
include 1- and 2-aminonaphthalene, p,p'-diaminostilbenes, pyrenes, quaternary 
phenanthridine salts, 9-aminoacridines, p,p'-diaminobenzophenone imines, 
anthracenes, oxacarbocyanine, merocyanine, 3-aminoequilenin, perylene, 
bis-benzoxazole, bis-p-oxazolyl benzene, 1,2-benzophenazin, retinol, 
bis-3-aminopyridinium salts, hellebrigenin, tetracycline, sterophenol, 
benzimidazolylphenylamine, 2-oxo-3-chromen, indole, xanthen, 7-hydroxycoumarin, 
phenoxazine, salicylate, strophanthidin, porphyrins, triarylmethanes and flavin. 

Individual fluorescent compounds which have functionalities for linking or which can 
be modified to incorporate such functionalities include, e.g., dansyl chloride; 
fluoresceins such as 3,6-dihydroxy-9-phenylxanthhydrol; rhodamineisothiocyanate; 
N-phenyl 1-amino-8-sulfonatonaphthalene; N-phenyl 2-amino-6-sulfonatonaphthalene; 
4-acetamido-4-isothiocyanato-stilbene-2,2'-disulfonic acid; 
pyrene-3-sulfonic acid; 2-toluidinonaphthalene-6-sulfonate; 
N-phenyl, N-methyl 2-aminonaphthalene-6-sulfonate; ethidium bromide; stebrine; 
auromine-0,2-(9'-anthroyl)palmitate; dansyl phosphatidylethanolamine; 
N,N'-dioctadecyl oxacarbocyanine; N,N'-dihexyl oxacarbocyanine; merocyanine, 
4-(3'pyrenyl)butyrate; d-3-aminodesoxy-equilenin; 12-(9'-anthroyl)stearate; 
2-methylanthracene; 9-vinylanthracene; 2,2'-(vinylene-p-phenylene)bisbenzoxazole; 
p-bis[2-(4-methyl-5-phenyl-oxazolyl)]benzene; 6-dimethylamino-1,2-benzophenazin; 
retinol; bis(3'-aminopyridinium) 1,10-decandiyl diiodide; sulfonaphthylhydrazone of 
hellebrigenin; chlorotetracycline; 
N-(7-dimethylamino-4-methyl-2-oxo-3-chromenyl)maleimide; 
N-[p-(2-benzimidazolyl)-phenyl]maleimide; N-(4-fluoranthyl)maleimide; 
bis(homovanillic acid); resazarin; 4-chloro-7-nitro-2,1,3-benzooxadiazole; 
merocyanine 540; resorufin; rose bengal; and 2,4-diphenyl-3(2H)-furanone. 

Desirably, fluorescers should absorb light above about 300 nm, preferably above about 
350 nm, and more preferably above about 400 nm, usually emitting at wavelengths 
greater than about 10 nm higher than the wavelength of the light absorbed. It should 
be noted that the absorption and emission characteristics of the bound dye may 
differ from the unbound dye. Therefore, when referring to the various wavelength 
ranges and characteristics of the dyes, it is intended to indicate the dyes as 
employed and not the dye which is unconjugated and characterized in an arbitrary 
solvent. 

Fluorescers are generally preferred because by irradiating a fluorescer with light, 
one can obtain a plurality of emissions. Thus, a single label can provide for a 
plurality of measurable events. 

Detectable signal may also be provided by chemiluminescent and bioluminescent 
sources. Chemiluminescent sources include a compound which becomes 
electronically excited by a chemical reaction and may then emit light which serves 
as the detectable signal or donates energy to a fluorescent acceptor. A diverse 
number of families of compounds have been found to provide chemiluminescence 
under a variety of conditions. One family of compounds is 
2,3-dihydro-1,4-phthalazinedione. The most popular compound is luminol, which is the 
5-amino compound. Other members of the family include the 5-amino-6,7,8-trimethoxy- and 
the dimethylamino[ca]benz analog. These compounds can be made to luminesce 
with alkaline hydrogen peroxide or calcium hypochlorite and base. Another family of 
compounds is the 2,4,5-triphenylimidazoles, with lophine as the common name for 
the parent product. Chemiluminescent analogs include para-dimethylamino and 
-methoxy substituents. Chemiluminescence may also be obtained with oxalates, 
usually oxalyl active esters, e.g., p-nitrophenyl and a peroxide, e.g., hydrogen 
peroxide, under basic conditions. Alternatively, luciferins may be used in conjunction 
with luciferase or lucigenins to provide bioluminescence. 

Spin labels are provided by reporter molecules with an unpaired electron spin which 
can be detected by electron spin resonance (ESR) spectroscopy. Exemplary spin 
labels include organic free radicals, transition metal complexes, particularly 
vanadium, copper, iron, and manganese, and the like. Exemplary spin labels include 
nitroxide free radicals. 

Fragmentation 

In addition, amplified sequences may be subjected to other post amplification 
treatments. For example, in some cases, it may be desirable to fragment the 
sequence prior to hybridization with an oligonucleotide array, in order to provide 
segments which are more readily accessible to the probes, which avoid looping 
and/or hybridization to multiple probes. Fragmentation of the nucleic acids may 
generally be carried out by physical, chemical or enzymatic methods that are known 
in the art. 

Sample Analysis 



Following the various sample preparation operations, the sample will generally be 
subjected to one or more analysis operations. Particularly preferred analysis 
operations include, e.g., sequence based analyses using an oligonucleotide array 
and/or size based analyses using, e.g., microcapillary array electrophoresis. 

Capillary Electrophoresis 

In some embodiments, it may be desirable to provide an additional, or alternative 
means for analyzing the nucleic acids from the sample. 

Microcapillary array electrophoresis generally involves the use of a thin capillary or 
channel which may or may not be filled with a particular separation medium. 
Electrophoresis of a sample through the capillary provides a size based separation 
profile for the sample. Microcapillary array electrophoresis generally provides a rapid 
method for size based sequencing, PCR product analysis and restriction fragment 
sizing. The high surface to volume ratio of these capillaries allows for the application 
of higher electric fields across the capillary without substantial thermal variation 
across the capillary, consequently allowing for more rapid separations. Furthermore, 
when combined with confocal imaging methods, these methods provide sensitivity in 
the range of attomoles, which is comparable to the sensitivity of radioactive 
sequencing methods. 

In many capillary electrophoresis methods, the capillaries, e.g., fused silica 
capillaries or channels etched, machined or molded into planar substrates, are filled 
with an appropriate separation/sieving matrix. A variety of sieving matrices 
known in the art may be used in the microcapillary arrays. Examples of such 
matrices include, e.g., hydroxyethyl cellulose, polyacrylamide, agarose and the like. 

Gel matrices may be introduced and polymerized within the capillary channel. 
However, in some cases, this may result in entrapment of bubbles within the 
channels which can interfere with sample separations. Accordingly, it is often 
desirable to place a preformed separation matrix within the capillary channel(s), 
prior to mating the planar elements of the capillary portion. Fixing the two parts, e.g., 
through sonic welding, permanently fixes the matrix within the channel. 
Polymerization outside of the channels helps to ensure that no bubbles are formed. 
Further, the pressure of the welding process helps to ensure a void-free system. 



In addition to its use in nucleic acid "fingerprinting" and other size based analyses, 
the capillary arrays may also be used in sequencing applications. In particular, gel 
based sequencing techniques may be readily adapted for capillary array 
electrophoresis. 

Expression products 

In addition to detection of mRNA, or as the sole detection method, expression 
products from the genes discussed above may be detected as indications of the 
biological condition of the tissue. Expression products may be detected in either the 
tissue sample as such, or in a body fluid sample, such as blood, serum, plasma, 
faeces, mucus, sputum, cerebrospinal fluid, and/or urine of the individual. 

The expression products, peptides and proteins, may be detected by any suitable 
technique known to the person skilled in the art. 

In a preferred embodiment the expression products are detected by means of 
specific antibodies directed to the various expression products, such as 
immunofluorescent and/or immunohistochemical staining of the tissue. 

Immunohistochemical localization of expressed proteins may be carried out by 
immunostaining of tissue sections from the single tumors to determine which cells 
expressed the protein encoded by the transcript in question. The transcript levels 
were used to select a group of proteins supposed to show variation from sample to 
sample, making possible a rough correlation between level of protein detected and 
intensity of the transcript on the microarray. 

For example, sections are cut from paraffin-embedded tissue blocks, mounted, and 
deparaffinized by incubation at 80 °C for 10 min, followed by immersion in heated oil 
at 60 °C for 10 min (Estisol 312, Estichem A/S, Denmark) and rehydration. Antigen 
retrieval is achieved in TEG (TrisEDTA-Glycerol) buffer using microwaves at 900 W. 
The tissue sections are cooled in the buffer for 15 min before a brief rinse in tap water. 
Endogenous peroxidase activity is blocked by incubating the sections with 1% H2O2 
for 20 min, followed by three rinses in tap water, 1 min each. The sections are then 
soaked in PBS buffer for 2 min. The next steps are modified from the descriptions 
given by Oncogene Science Inc., in the Mouse Immunohistochemistry Detection 
System, XHC01 (UniTect, Uniondale, NY, USA). Briefly, the tissue sections are 
incubated overnight at 4 °C with primary antibody (against beta-2 microglobulin 
(Dako), cytokeratin 8, cystatin-C (both from Europa, US), junB, CD59, E-cadherin, 
apo-E, cathepsin E, vimentin, IGFII (all from Santa Cruz)), followed by three rinses in 
PBS buffer for 5 min each. Afterwards, the sections are incubated with biotinylated 
secondary antibody for 30 min, rinsed three times with PBS buffer and subsequently 
incubated with ABC (avidin-biotinylated horseradish peroxidase complex) for 30 
min, followed by three rinses in PBS buffer. 

Staining is performed by incubation with AEC (3-amino-9-ethylcarbazole) for 10 min. 
The tissue sections are counterstained with Mayer's hematoxylin, washed in tap 
water for 5 min, and mounted with glycerol-gelatin. Positive and negative controls 
may be included in each staining round with all antibodies. 

In yet another embodiment the expression products may be detected by means of 
conventional enzyme assays, such as ELISA methods. 

Furthermore, the expression products may be detected by means of peptide/protein 
chips capable of specifically binding the peptides and/or proteins assessed. Thereby 
an expression pattern may be obtained. 

Assay 

Thus, in a further aspect the invention relates to an assay for determining an 
expression pattern of a colon and/or rectum cell, comprising at least a first marker 
and/or a second marker, wherein the first marker is capable of detecting a gene 
from a first gene group as defined above, and the second marker is capable of 
detecting a gene from a second gene group as defined above. 

In a preferred embodiment the assay comprises at least two markers for each gene 
group. 

correlating the first expression level and the second expression level to a standard 
level of the assessed genes to determine the presence or absence of a biological 
condition in the animal tissue. 

The marker(s) preferably specifically detect a gene as identified herein, in 
particular the genes of the tables in the examples and as discussed above. 

As discussed above the marker may be any nucleotide probe, such as a DNA, RNA, 
PNA, or LNA probe capable of hybridising to mRNA indicative of the expression 
level. The hybridisation conditions are preferably as described below for probes. 

In another embodiment the marker is an antibody capable of specifically binding the 
expression product in question. 

Detection 

Patterns can be compared manually by a person or by a computer or other machine. 
An algorithm can be used to detect similarities and differences. The algorithm may 
score and compare, for example, the genes which are expressed and the genes 
which are not expressed. Alternatively, the algorithm may look for changes in 
intensity of expression of a particular gene and score changes in intensity between 
two samples. Similarities may be determined on the basis of genes which are 
expressed in both samples and genes which are expressed in neither sample, or 
on the basis of genes whose intensities of expression are numerically similar. 
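
One possible form of such a comparison algorithm is sketched below. It is an illustration only: the presence threshold, the fold-change cut-off of 2.0, the dictionary layout and the function name are assumptions made for the example and are not prescribed by this specification.

```python
def compare_patterns(sample_a, sample_b, present_threshold, fold_threshold=2.0):
    """Score two expression patterns given as dicts of gene -> intensity.

    A gene is called expressed when its intensity reaches present_threshold
    (an assumed cut-off). Genes expressed in both samples are additionally
    flagged as changed when their intensities differ by at least
    fold_threshold in either direction.
    """
    if present_threshold <= 0:
        raise ValueError("present_threshold must be positive")
    result = {
        "both_expressed": [],
        "neither_expressed": [],
        "only_a": [],
        "only_b": [],
        "changed": [],
    }
    # Only genes measured in both samples can be compared.
    for gene in sorted(set(sample_a) & set(sample_b)):
        a, b = sample_a[gene], sample_b[gene]
        a_on, b_on = a >= present_threshold, b >= present_threshold
        if a_on and b_on:
            result["both_expressed"].append(gene)
            if max(a, b) / min(a, b) >= fold_threshold:
                result["changed"].append(gene)
        elif a_on:
            result["only_a"].append(gene)
        elif b_on:
            result["only_b"].append(gene)
        else:
            result["neither_expressed"].append(gene)
    return result
```

Genes in the "both_expressed" and "neither_expressed" lists count towards similarity of the two patterns, whereas "only_a", "only_b" and "changed" list the differences.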

Generally, the detection operation will be performed using a reader device external 
to the diagnostic device. However, it may be desirable in some cases, to incorporate 
the data gathering operation into the diagnostic device itself. 

The detection apparatus may be a fluorescence detector, or a spectroscopic 
detector, or another detector. 

Although hybridization is one type of specific interaction which is clearly useful for 
use in this mapping embodiment, antibody reagents may also be very useful. 



Data Gathering and Analysis 



Gathering data from the various analysis operations, e.g., oligonucleotide and/or 
microcapillary arrays, will typically be carried out using methods known in the art. 
For example, the arrays may be scanned using lasers to excite fluorescently labeled 
targets that have hybridized to regions of probe arrays mentioned above, which can 
then be imaged using charge-coupled devices ("CCDs") for a wide field scanning of 
the array. Alternatively, another particularly useful method for gathering data from 
the arrays is through the use of laser confocal microscopy which combines the ease 
and speed of a readily automated process with high resolution detection. 

Following the data gathering operation, the data will typically be reported to a data
analysis operation. To facilitate the sample analysis operation, the data obtained by
the reader from the device will typically be analyzed using a digital computer.
Typically, the computer will be appropriately programmed for receipt and storage of
the data from the device, as well as for analysis and reporting of the data gathered,
i.e., interpreting fluorescence data to determine the sequence of hybridizing probes,
normalization of background and single base mismatch hybridizations, ordering of
sequence data in SBH applications, and the like.

It is an object of the present invention to provide a biological sample which may be
classified or characterized by analyzing the pattern of specific interactions
mentioned above. This may be applicable to a cell or tissue type, to the messenger
RNA population expressed by a cell, to the genetic content of a cell, or to virtually
any sample which can be classified and/or identified by its combination of specific
molecular properties.

Pharmaceutical composition 


The invention also relates to a pharmaceutical composition for treating the biological
condition, such as colorectal tumors.

In one embodiment the pharmaceutical composition comprises one or more of the
peptides being expression products as defined above. In a preferred embodiment,
the peptides are bound to carriers. The peptides may suitably be coupled to a
polymer carrier, for example a protein carrier, such as BSA. Such formulations are
well-known to the person skilled in the art.

The peptides may be suppressor peptides normally lost or decreased in tumor
tissue, administered in order to stabilise tumors towards a less malignant stage. In
another embodiment the peptides are onco-peptides capable of eliciting an immune
response towards the tumor cells.



In another embodiment the pharmaceutical composition comprises genetic material,
either genetic material for substitution therapy, or for suppressing therapy as
discussed below.

In a third embodiment the pharmaceutical composition comprises at least one
antibody produced as described above.

In the present context the term pharmaceutical composition is used synonymously
with the term medicament. The medicament of the invention comprises an effective
amount of one or more of the compounds as defined above, or a composition as
defined above in combination with pharmaceutically acceptable additives. Such a
medicament may suitably be formulated for oral, percutaneous, intramuscular,
intravenous, intracranial, intrathecal, intracerebroventricular, intranasal or
pulmonary administration. For most indications a localised or substantially localised
application is preferred.

Strategies in formulation development of medicaments and compositions based on
the compounds of the present invention generally correspond to formulation
strategies for any other protein-based drug product. Potential problems and the
guidance required to overcome these problems are dealt with in several textbooks, e.g.
"Therapeutic Peptides and Proteins: Formulation, Processing, and Delivery Systems",
Ed. A.K. Banga, Technomic Publishing AG, Basel, 1995.

Injectables are usually prepared either as liquid solutions or suspensions, or as solid
forms suitable for solution in, or suspension in, liquid prior to injection. The
preparation may also be emulsified. The active ingredient is often mixed with excipients
which are pharmaceutically acceptable and compatible with the active ingredient.
Suitable excipients are, for example, water, saline, dextrose, glycerol, ethanol or the
like, and combinations thereof. In addition, if desired, the preparation may contain
minor amounts of auxiliary substances such as wetting or emulsifying agents, pH
buffering agents, or agents which enhance the effectiveness or transportation of the
preparation.

Formulations of the compounds of the invention can be prepared by techniques
known to the person skilled in the art. The formulations may contain pharmaceutically
acceptable carriers and excipients including microspheres, liposomes,
microcapsules, nanoparticles or the like.

The preparation may suitably be administered by injection, optionally at the site
where the active ingredient is to exert its effect. Additional formulations which are
suitable for other modes of administration include suppositories and, in some
cases, oral formulations. For suppositories, traditional binders and carriers include
polyalkylene glycols or triglycerides. Such suppositories may be formed from
mixtures containing the active ingredient(s) in the range of from 0.5% to 10%, preferably
1-2%. Oral formulations include such normally employed excipients as, for example,
pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium
saccharine, cellulose, magnesium carbonate, and the like. These compositions take
the form of solutions, suspensions, tablets, pills, capsules, sustained release
formulations or powders and generally contain 10-95% of the active ingredient(s),
preferably 25-70%.

The preparations are administered in a manner compatible with the dosage
formulation, and in such amount as will be therapeutically effective. The quantity to be
administered depends on the subject to be treated, including, e.g. the weight and age
of the subject, the disease to be treated and the stage of disease. Suitable dosage
ranges are of the order of several hundred µg active ingredient per administration
with a preferred range of from about 0.1 µg to 1000 µg, such as in the range of from
about 1 µg to 300 µg, and especially in the range of from about 10 µg to 50 µg.
Administration may be performed once or may be followed by subsequent
administrations. The dosage will also depend on the route of administration and will vary with
the age and weight of the subject to be treated. A preferred dose would be in the
interval 30 mg to 70 mg per 70 kg body weight.

Some of the compounds of the present invention are sufficiently active, but for some
of the others, the effect will be enhanced if the preparation further comprises
pharmaceutically acceptable additives and/or carriers. Such additives and carriers will be
known in the art. In some cases, it will be advantageous to include a compound
which promotes delivery of the active substance to its target.

In many instances, it will be necessary to administer the formulation multiple
times. Administration may be a continuous infusion, such as intraventricular infusion,
or administration in several doses, such as several times a day, daily, several times a
week, weekly, etc.

Vaccines 

In a further embodiment the present invention relates to a vaccine for the
prophylaxis or treatment of a biological condition comprising at least one expression
product from at least one gene, said gene being expressed as defined above.

The term vaccines is used with its normal meaning, i.e. preparations of immunogenic
material for administration to induce in the recipient an immunity to infection or
intoxication by a given infecting agent. Vaccines may be administered by intravenous
injection or through oral, nasal and/or mucosal administration. Vaccines may be
either simple vaccines prepared from one species of expression products, such as
proteins or peptides, or a variety of expression products, or they may be mixed
vaccines containing two or more simple vaccines. They are prepared in such a manner
as not to destroy the immunogenic material, although the methods of preparation
vary, depending on the vaccine.

The enhanced immune response achieved according to the invention can be
attributable to e.g. an enhanced increase in the level of immunoglobulins or in the level of
T-cells, including cytotoxic T-cells, and will result in immunisation of at least 50% of
individuals exposed to said immunogenic composition or vaccine, such as at least 55%,
for example at least 60%, such as at least 65%, for example at least 70%, for
example at least 75%, such as at least 80%, for example at least 85%, such as at least
90%, for example at least 92%, such as at least 94%, for example at least 96%,
such as at least 97%, for example at least 98%, such as at least 98.5%, for example
at least 99%, for example at least 99.5% of the individuals exposed to said
immunogenic composition or vaccine.

Compositions according to the invention may also comprise any carrier and/or
adjuvant known in the art including functional equivalents thereof. Functionally
equivalent carriers are capable of presenting the same immunogenic determinant in
essentially the same steric conformation when used under similar conditions.
Functionally equivalent adjuvants are capable of providing similar increases in the
efficacy of the composition when used under similar conditions.



Therapy 

The invention further relates to a method of treating individuals suffering from the
biological condition in question, in particular for treating a colorectal tumor.

In one embodiment the invention relates to a method of substitution therapy, i.e.
administration of genetic material generally expressed in normal cells, but lost or
decreased in biological condition cells (tumor suppressors). Thus, the invention
relates to a method for reducing cell tumorigenicity of a cell, said method comprising

obtaining at least one gene selected from genes being expressed in an amount
two-fold higher in normal cells than the amount expressed in said tumor cell (tumor
suppressors),

introducing said at least one gene into the tumor cell in a manner allowing
expression of said gene(s).
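
The two-fold criterion in the first step can be sketched as a simple filter over expression levels. This is only an illustration: the function name, the dictionary inputs and the `floor` parameter (used here to avoid dividing by zero or negative values) are assumptions and are not part of the method as described.

```python
def suppressor_candidates(normal, tumor, min_fold=2.0, floor=1.0):
    """Genes expressed at least min_fold higher in normal cells than in tumor cells.

    normal, tumor: dicts mapping gene identifier -> expression level.
    floor: hypothetical lower bound applied to both levels before the ratio
           is taken, so that absent (zero) expression does not divide by zero.
    Returns a dict mapping gene -> normal/tumor fold ratio.
    """
    hits = {}
    for gene, n_level in normal.items():
        if gene not in tumor:
            continue  # cannot compare a gene measured in only one sample
        fold = max(n_level, floor) / max(tumor[gene], floor)
        if fold >= min_fold:
            hits[gene] = fold
    return hits


# Hypothetical expression levels in arbitrary units.
normal_levels = {"g1": 400.0, "g2": 100.0, "g3": 50.0}
tumor_levels = {"g1": 100.0, "g2": 90.0, "g3": 0.0}
print(suppressor_candidates(normal_levels, tumor_levels))
```

Swapping the two arguments gives the mirror-image filter, i.e. genes expressed more highly in tumor cells than in normal cells.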

The at least one gene is preferably selected individually from genes comprising a
sequence as identified below

RC_H04768_at      chrom 15 no homology
RC_Z39652_at      Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at      chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at    tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at      secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at    chrom 2 no homology
RC_R01646_at      chrom 13q32.1-33.3 ; AL159152 ; homology to mouse Pcbp1 - poly(rC)-binding protein 1
AA319615_at       secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15



and from

J03915          Human chromogranin A mRNA, complete cds
M84526          Human adipsin/complement factor D mRNA, complete cds
M24248          Homo sapiens MLC-1V/Sb isoform gene
M22324          Human aminopeptidase N/CD13 mRNA encoding aminopeptidase N, complete cds
X76717          H.sapiens MT-11 mRNA
Z70295          H.sapiens GCAP-II gene
J00306          Human somatostatin I gene and flanks
U52101          Human YMP mRNA, complete cds
X87159          H.sapiens mRNA for beta subunit of epithelial amiloride-sensitive sodium channel
U77643          Human K12 protein precursor mRNA, complete cds
U14528          Human sulfate transporter (DTD) mRNA, complete cds
U66075          Human transcription factor hGATA-6 mRNA, complete cds.
Z80345          H.sapiens SCAD gene, exon 1 and joining features
M87860          Human S-lac lectin L-14-II (LGALS2) gene
D15049          Human mRNA for protein tyrosine phosphatase
X64559          H.sapiens mRNA for tetranectin
U28249          Human 11 kd protein mRNA, complete cds
U29700          Human anti-mullerian hormone type II receptor precursor gene, complete cds
M60047          Human heparin binding protein (HBp17) mRNA, complete cds
M57763          Human ADP-ribosylation factor (hARF6) mRNA, complete cds
S81083          beta -ADD=adducin beta subunit 63 kda isoform/membrane skeleton protein,
                beta -ADD=adducin beta subunit 63 kda isoform/membrane skeleton protein
                {alternatively spliced, exon 10 to 13 region} [human, Genomic, 1851 nt,
                segment 3 of 3].
HG4243-HT4513   Zinc Finger Protein Znf155
J04040          Human glucagon mRNA, complete cds
X99140          H.sapiens mRNA for hair keratin, hHb5
U61232          Human tubulin-folding cofactor E mRNA, complete cds
M59911          Human integrin alpha-3 chain mRNA, complete cds
U46901          Human NACP gene
Z47553          H.sapiens mRNA for flavin-containing monooxygenase 5 (FMO5)
X52943          Human mRNA for ATF-a transcription factor
X77777          H.sapiens intestinal VIP receptor related protein mRNA

In a preferred embodiment at least two different genes are introduced into the tumor
cell.

In another aspect the invention relates to a therapy whereby genes generally
correlated to disease are inhibited by one or more of the following methods:

A method for reducing cell tumorigenicity of a cell, said method comprising

obtaining at least one nucleotide probe capable of hybridising with at least one gene
of a tumor cell, said at least one gene being selected from genes being expressed in
an amount at least one-fold lower in normal cells than the amount expressed in said
tumor cell, and

introducing said at least one nucleotide probe into the tumor cell in a manner
allowing the probe to hybridise to the at least one gene, thereby inhibiting
expression of said at least one gene. This method is preferably based on anti-sense
technology, whereby the hybridisation of said probe to the gene leads to a
down-regulation of said gene.
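
A probe that hybridises to a given target strand is, in the simplest case, the reverse complement of that target. The following minimal sketch derives such an anti-sense DNA sequence. The function name and the example sequence are hypothetical, and real probe design involves further considerations (length, specificity, secondary structure) that are not modelled here.

```python
# Watson-Crick complements, written as DNA; U in an RNA target pairs with A.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def antisense_dna(target):
    """Return the DNA reverse complement of a DNA or RNA target sequence."""
    unexpected = set(target) - set("ACGTUacgtu")
    if unexpected:
        raise ValueError(f"unexpected characters in sequence: {sorted(unexpected)}")
    return target.translate(_COMPLEMENT)[::-1]


# Hypothetical six-nucleotide stretch of an mRNA target.
print(antisense_dna("AUGGCC"))
```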

The down-regulation may of course also be based on a probe capable of hybridising
to regulatory components of the genes in question, such as promoters.

The probes are preferably selected from probes capable of hybridising to a
nucleotide sequence comprising a sequence as identified below

RC_AA609013_s_at   APPPP   microsomal dipeptidase (also on 6.8k); chrom 16
RC_AA232508_at     APPPP   CGI-89 protein; unnamed protein product; hypothetical protein
RC_AA428964_at     APPPP   serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1
RC_T52813_s_at     APPPP   dJ28O10.2 (G0S2 (PUTATIVE LYMPHOCYTE G0/G1 SWITCH PROTEIN 2; chrom 1
RC_AA075642_at     APPPP   gp-340 variant protein; DMBT1/8kb.2 protein
RC_AA007218_at     APPPP   chrom 13 no homology
RC_N33920_at       APPPP   ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6
RC_N71781_at       APPPP   KIAA1199 protein, chrom 15
RC_R67275_s_at     APPPP   alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1
RC_W80763_at       APPPP   hypothetical protein; chrom 17
[illegible]        APPPP   chrom 7p22; AC006028 BAC clone
RC_AA034499_s_at   APPPP   ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13
RC_AA035482_at     APPPP   chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_AA024482_at     APPPP   hypothetical protein; unnamed protein product; chrom 17
[illegible]        APPPP   chrom 2 ; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at     APPPP   no homology
RC_AA417078_at     APPPP   chrom 7q31; AF017104 clone
M29873_s_at        APPPP   cytochrome P450-IIB (hIIB3); 19q13.1-q13.2
RC_H27498_f_at     AAPPP
RC_T92363_s_at     AAPPP
RC_N89910_at       AAAPP
RC_W60516_at       AAAPP
RC_AA219699_at     AAAPP
RC_AA449450_at     AAAPP


Or from 



J05257          Homo sapiens (clones MDP4, MDP7) microsomal dipeptidase (MDP) mRNA, complete cds
L08010          Homo sapiens reg gene homologue, complete cds
Z74616          H.sapiens mRNA for prepro-alpha2(I) collagen
M61832          Human S-adenosylhomocysteine hydrolase (AHCY) mRNA, complete cds
HG4312-HT4582   Transcription Factor IIIa
X54489          Human gene for melanoma growth stimulatory activity (MGSA)
X57766          Human stromelysin-3 mRNA
S78187          CDC25Hu2=cdc25+ homolog [human, mRNA, 3118 nt]
X14253          Human mRNA for cripto protein
M86752          Human transformation-sensitive protein (IEF SSP 3521) mRNA, complete cds
L09708          Human complement component 2 (C2) gene allele b
X92896          H.sapiens mRNA for ITBA2 protein
Z22555          H.sapiens encoding CLA-1 mRNA
L03840          Human fibroblast growth factor receptor 4 (FGFR4) mRNA, complete cds
HG3044-HT3742   Fibronectin, Alt. Splice 1
X54667          tyk2
X13293          Human mRNA for B-myb gene
U24183          Human phosphofructokinase (PFKM) mRNA, complete cds
U02020          Human pre-B cell enhancing factor (PBEF) mRNA, complete cds
U57650          Human SH2-containing inositol 5-phosphatase (hSHIP) mRNA, complete cds
M28130          Human interleukin 8 (IL8) gene, complete cds
L25931          Human lamin B receptor (LBR) mRNA, complete cds
Z48541          H.sapiens mRNA for protein tyrosine phosphatase
D63851          Human mRNA for unc-18 homologue, complete cds
X59766          H.sapiens mRNA for Zn-alpha2-glycoprotein
Z25521          [illegible]
M27396          Human asparagine synthetase mRNA, complete cds
U63825          Human hepatitis delta antigen interacting protein A (dipA) mRNA, complete cds
U08815          Human splicesomal protein (SAP 61) mRNA, complete cds
U48251          Human protein kinase C-binding protein RACK7 mRNA, partial cds
L19183          Human MAC30 mRNA, 3' end
L12350          Human thrombospondin 2 (THBS2) mRNA, complete cds
U08021          Human nicotinamide N-methyltransferase (NNMT) mRNA, complete cds
X54925          H.sapiens mRNA for type I interstitial collagenase
U29463          Human cytochrome b561 gene
M32053          Human H19 RNA gene, complete cds (spliced in silico)
L22548          Human collagen type XVIII alpha 1 (COL18A1) mRNA, partial cds
U79274          Human clone 23733 mRNA, complete cds.

In another embodiment the probes consist of the sequences identified above.

The hybridization may be tested in vitro at conditions corresponding to in vivo
conditions. Typically, hybridization conditions are of low to moderate stringency.
These conditions favour specific interactions between completely complementary
sequences, but allow some non-specific interaction between less than perfectly
matched sequences to occur as well. After hybridization, the nucleic acids can be
"washed" under moderate or high conditions of stringency to dissociate duplexes
that are bound together by some non-specific interaction (the nucleic acids that form
these duplexes are thus not completely complementary).

As is known in the art, the optimal conditions for washing are determined
empirically, often by gradually increasing the stringency. The parameters that can be
changed to affect stringency include, primarily, temperature and salt concentration.
In general, the lower the salt concentration and the higher the temperature, the
higher the stringency. Washing can be initiated at a low temperature (for example,
room temperature) using a solution containing a salt concentration that is equivalent
to or lower than that of the hybridization solution. Subsequent washing can be
carried out using progressively warmer solutions having the same salt concentration.
As alternatives, the salt concentration can be lowered and the temperature
maintained in the washing step, or the salt concentration can be lowered and the
temperature increased. Additional parameters can also be altered. For example, use of
a destabilizing agent, such as formamide, alters the stringency conditions.

In reactions where nucleic acids are hybridized, the conditions used to achieve a
given level of stringency will vary. There is not one set of conditions, for example,
that will allow duplexes to form between all nucleic acids that are 85% identical to
one another; hybridization also depends on unique features of each nucleic acid.
The length of the sequence, the composition of the sequence (for example, the
content of purine-like nucleotides versus the content of pyrimidine-like nucleotides)
and the type of nucleic acid (for example, DNA or RNA) affect hybridization. An
additional consideration is whether one of the nucleic acids is immobilized (for
example, on a filter).
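
To give a feel for how length and base composition influence duplex stability, the sketch below applies one widely used rule of thumb for short oligonucleotides (often called the Wallace rule): roughly 2°C per A/T base pair plus 4°C per G/C base pair. This rule is not taken from this specification, it is only a rough approximation for oligonucleotides of about 14 to 20 nucleotides, and the function name is hypothetical.

```python
def wallace_tm(oligo):
    """Approximate melting temperature (deg C) of a short oligonucleotide duplex.

    Rule of thumb: Tm = 2 * (number of A/T/U) + 4 * (number of G/C).
    """
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T") + seq.count("U")
    gc = seq.count("G") + seq.count("C")
    if not seq or at + gc != len(seq):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T or U")
    return 2 * at + 4 * gc


# Two hypothetical 16-mers of equal length but different composition.
print(wallace_tm("ATGCATGCATGCATGC"))
print(wallace_tm("AAAAAAAAAAAAAAAA"))
```

Two sequences of the same length can therefore require different wash conditions to reach the same effective stringency.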



An example of a progression from lower to higher stringency conditions is the
following, where the salt content is given as the relative abundance of SSC (a salt
solution containing sodium chloride and sodium citrate; 2X SSC is 10-fold more
concentrated than 0.2X SSC). Nucleic acids are hybridized at 42°C in 2X SSC/0.1%
SDS (sodium dodecylsulfate; a detergent) and then washed in 0.2X SSC/0.1% SDS
at room temperature (for conditions of low stringency); 0.2X SSC/0.1% SDS at 42°C
(for conditions of moderate stringency); and 0.1X SSC at 68°C (for conditions of
high stringency). Washing can be carried out using only one of the conditions given,
or each of the conditions can be used (for example, washing for 10-15 minutes each
in the order listed above). Any or all of the washes can be repeated. As mentioned
above, optimal conditions will vary and can be determined empirically.
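
The wash series above can be written down as data and checked for the expected trend (salt going down or staying equal, temperature going up or staying equal). The sketch assumes the common laboratory recipe in which 1X SSC contains 0.15 M sodium chloride, and it assumes 22°C for "room temperature"; neither value is stated in this specification, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

NACL_MOLAR_PER_1X_SSC = 0.15  # assumption: standard SSC recipe, not from the text
ROOM_TEMP_C = 22.0            # assumption: nominal room temperature


@dataclass(frozen=True)
class Wash:
    label: str
    ssc_x: float                  # SSC strength, e.g. 0.2 for 0.2X SSC
    sds_percent: Optional[float]  # None where the text gives no SDS
    temp_c: float


# The three wash conditions listed in the paragraph above.
WASH_SERIES = [
    Wash("low", 0.2, 0.1, ROOM_TEMP_C),
    Wash("moderate", 0.2, 0.1, 42.0),
    Wash("high", 0.1, None, 68.0),
]


def nacl_molar(ssc_x):
    """Sodium chloride molarity for a given SSC strength."""
    return ssc_x * NACL_MOLAR_PER_1X_SSC


def is_non_decreasing_stringency(series):
    """True if salt never rises and temperature never falls along the series."""
    for prev, nxt in zip(series, series[1:]):
        if nxt.ssc_x > prev.ssc_x or nxt.temp_c < prev.temp_c:
            return False
    return True


for w in WASH_SERIES:
    print(f"{w.label}: {w.ssc_x}X SSC = {nacl_molar(w.ssc_x):.3f} M NaCl at {w.temp_c:.0f} C")
print(is_non_decreasing_stringency(WASH_SERIES))
```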

In another aspect a method of reducing tumorigenicity relates to the use of
antibodies against an expression product of a cell from the biological tissue. The
antibodies may be produced by any suitable method, such as a method comprising
the steps of

obtaining expression product(s) from at least one gene, said gene being expressed
as defined above for oncogenes,

immunising a mammal with said expression product(s), and

obtaining antibodies against the expression product.

Use 


The methods described above may be used for producing an assay for diagnosing a 
biological condition in animal tissue, or for identification of the origin of a piece of 
tissue. 

Furthermore, the invention relates to the use of a peptide as defined above for
preparation of a pharmaceutical composition for the treatment of a biological
condition in animal tissue.

Furthermore, the invention relates to the use of a gene as defined above for
preparation of a pharmaceutical composition for the treatment of a biological
condition in animal tissue.

Also, the invention relates to the use of a probe as defined above for preparation of
a pharmaceutical composition for the treatment of a biological condition in animal
tissue.

Gene delivery therapy 

The genetic material discussed above may be any of the described genes or
functional parts thereof. The constructs may be introduced as a single DNA
molecule encoding all of the genes, or different DNA molecules having one or more
genes. The constructs may be introduced simultaneously or consecutively, each
with the same or different markers.

The gene may be linked to the complex as such or protected by any suitable system
normally used for transfection such as viral vectors or artificial viral envelope,
liposomes or micelles, wherein the system is linked to the complex.

Numerous techniques for introducing DNA into eukaryotic cells are known to the
skilled artisan. Often this is done by means of vectors, and often in the form of
nucleic acid encapsidated by a (frequently virus-like) proteinaceous coat. Gene
delivery systems may be applied to a wide range of clinical as well as experimental
applications.

Vectors containing useful elements such as selectable and/or amplifiable markers,
promoter/enhancer elements for expression in mammalian, particularly human,
cells, and which may be used to prepare stocks of construct DNAs and for carrying
out transfections are well known in the art. Many are commercially available.

Various techniques have been developed for modification of target tissue and cells
in vivo. A number of virus vectors, discussed below, are known which allow
transfection and random integration of the virus into the host. See, for example,
Dubensky et al. (1984) Proc. Natl. Acad. Sci. USA 81:7529-7533; Kaneda et al., (1989)
Science 243:375-378; Hiebert et al. (1989) Proc. Natl. Acad. Sci. USA 86:3594-3598;
Hatzoglu et al., (1990) J. Biol. Chem. 265:17285-17293; Ferry et al. (1991)
Proc. Natl. Acad. Sci. USA 88:8377-8381. Routes and modes of administering the
vector include injection, e.g. intravascularly or intramuscularly, inhalation, or other
parenteral administration.

Advantages of adenovirus vectors for human gene therapy include the fact that
recombination is rare, no human malignancies are known to be associated with such
viruses, the adenovirus genome is double stranded DNA which can be manipulated
to accept foreign genes of up to 7.5 kb in size, and live adenovirus is a safe human
vaccine organism.

Another vector which can express the DNA molecule of the present invention, and
is useful in gene therapy, particularly in humans, is vaccinia virus, which can be
rendered non-replicating (U.S. Pat. Nos. 5,225,336; 5,204,243; 5,155,020; 4,769,330).

Based on the concept of viral mimicry, artificial viral envelopes (AVE) are designed
based on the structure and composition of a viral membrane, such as HIV-1 or RSV
and used to deliver genes into cells in vitro and in vivo. See, for example, U.S. Pat.
No. 5,252,348, Schreier H. et al., J. Mol. Recognit., 1995, 8:59-62; Schreier H et al.,
J. Biol. Chem., 1994, 269:9090-9098; Schreier, H., Pharm. Acta Helv. 1994, 68:145-159;
Chander, R et al. Life Sci., 1992, 50:481-489, which references are hereby
incorporated by reference in their entirety. The envelope is preferably produced in a
two-step dialysis procedure where the "naked" envelope is formed initially, followed
by unidirectional insertion of the viral surface glycoprotein of interest. This process
and the physical characteristics of the resulting AVE are described in detail by
Chander et al., (supra). Examples of AVE systems are (a) an AVE containing the
HIV-1 surface glycoprotein gp160 (Chander et al., supra; Schreier et al., 1995,
supra) or glycosyl phosphatidylinositol (GPI)-linked gp120 (Schreier et al., 1994,
supra), respectively, and (b) an AVE containing the respiratory syncytial virus (RSV)
attachment (G) and fusion (F) glycoproteins (Stecenko, A. A. et al., Pharm.
Pharmacol. Lett. 1:127-129 (1992)). Thus, vesicles are constructed which mimic the natural
membranes of enveloped viruses in their ability to bind to and deliver materials to
cells bearing corresponding surface receptors.

AVEs are used to deliver genes both by intravenous injection and by instillation in
the lungs. For example, AVEs are manufactured to mimic RSV, exhibiting the RSV F
surface glycoprotein which provides selective entry into epithelial cells. F-AVE are
loaded with a plasmid coding for the gene of interest, (or a reporter gene such as
CAT not present in mammalian tissue).

The AVE system described herein is physically and chemically essentially identical
to the natural virus yet is entirely "artificial", as it is constructed from phospholipids,
cholesterol, and recombinant viral surface glycoproteins. Hence, there is no
carry-over of viral genetic information and no danger of inadvertent viral infection.
Construction of the AVEs in two independent steps allows for bulk production of the
plain lipid envelopes which, in a separate second step, can then be marked with the
desired viral glycoprotein, also allowing for the preparation of protein cocktail
formulations if desired.

Another delivery vehicle for use in the present invention is based on the recent
description of attenuated Shigella as a DNA delivery system (Sizemore, D. R. et al.,
Science 270:299-302 (1995), which reference is incorporated by reference in its
entirety). This approach exploits the ability of Shigellae to enter epithelial cells and
escape the phagocytic vacuole as a method for delivering the gene construct into
the cytoplasm of the target cell. Invasion with as few as one to five bacteria can
result in expression of the foreign plasmid DNA delivered by these bacteria.

A preferred type of mediator of nonviral transfection in vitro and in vivo is cationic
(ammonium derivatized) lipids. These positively charged lipids form complexes with
negatively charged DNA, resulting in DNA charge neutralization and compaction.
The complexes are endocytosed upon association with the cell membrane, and the DNA
somehow escapes the endosome, gaining access to the cytoplasm. Cationic
lipid:DNA complexes appear highly stable under normal conditions. Studies of the
cationic lipid DOTAP suggest the complex dissociates when the inner layer of the
cell membrane is destabilized and anionic lipids from the inner layer displace DNA
from the cationic lipid. Several cationic lipids are available commercially. Two of
these, DMRI and DC-cholesterol, have been used in human clinical trials. First
generation cationic lipids are less efficient than viral vectors. For delivery to lung, any
inflammatory responses accompanying the liposome administration are reduced by
changing the delivery mode to aerosol administration which distributes the dose
more evenly.

Drug screening 


Genes identified as changing in various stages of colorectal cancer can be used as
markers for drug screening. Thus by treating colorectal cancer cells with test
compounds or extracts, and monitoring the expression of genes identified as
changing in the progression of colorectal cancers, one can identify compounds or
extracts which change expression of genes to a pattern which is of an earlier stage
or even of normal colorectal mucosa.
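
One way to picture the screening readout is to assign a treated sample to the reference pattern it most resembles and see whether that is an earlier stage than the untreated cells. The sketch below does this with a mean squared difference over marker genes. The distance measure, the function name and all expression values are assumptions chosen for illustration only.

```python
def nearest_stage(profile, stage_profiles):
    """Return the name of the reference stage whose marker pattern is closest.

    profile:        dict gene -> expression level of the sample being scored
    stage_profiles: dict stage name -> dict gene -> reference expression level
    Closeness is the mean squared difference over the genes both share.
    """
    def distance(reference):
        genes = profile.keys() & reference.keys()
        if not genes:
            raise ValueError("sample and reference share no marker genes")
        return sum((profile[g] - reference[g]) ** 2 for g in genes) / len(genes)

    return min(stage_profiles, key=lambda stage: distance(stage_profiles[stage]))


# Hypothetical marker levels for three reference patterns.
references = {
    "normal": {"g1": 100.0, "g2": 10.0},
    "Dukes A": {"g1": 60.0, "g2": 40.0},
    "Dukes C": {"g1": 10.0, "g2": 200.0},
}
untreated = {"g1": 15.0, "g2": 180.0}
treated = {"g1": 70.0, "g2": 50.0}

print(nearest_stage(untreated, references))
print(nearest_stage(treated, references))
```

A compound would be of interest in this toy setting if the treated cells map to an earlier stage, or to normal mucosa, than the untreated cells do.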

The following are non-limiting examples illustrating the present invention. 

Experimentals

We have used two different approaches to identify tumor suppressors, oncogenes
and classifiers. The first was a spreadsheet approach in which
we used the fold change and the pattern of expression being present or absent in
the different preparations of RNA. The second was a
mathematical approach in which we used correlation to a predefined profile,
based on Pearson's correlation coefficient, as the selection criterion.
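
The second approach can be sketched as follows: each gene's expression across the pools is correlated with a predefined profile, and genes above a cut-off are kept. The profiles shown (high only in normal, or high in all tumor pools), the cut-off of 0.9 and the expression values are assumptions for the example; the specification does not state them here.

```python
from math import sqrt

# Order of the pooled samples assumed for every expression vector below.
POOLS = ["Normal", "Dukes A", "Dukes B", "Dukes C", "Dukes D"]


def pearson(x, y):
    """Pearson's correlation coefficient, or None if either series is constant."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # correlation is undefined for a flat series
    return sxy / sqrt(sxx * syy)


def select_by_profile(expression, profile, min_r=0.9):
    """Sorted list of genes whose expression correlates with profile at >= min_r."""
    selected = []
    for gene, levels in expression.items():
        r = pearson(levels, profile)
        if r is not None and r >= min_r:
            selected.append(gene)
    return sorted(selected)


# Hypothetical predefined profiles over POOLS.
SUPPRESSOR_PROFILE = [1, 0, 0, 0, 0]  # expressed in normal, lost in tumors
ONCOGENE_PROFILE = [0, 1, 1, 1, 1]    # absent in normal, expressed in tumors

# Hypothetical expression levels, one value per pool.
expression = {
    "g_supp": [500, 100, 90, 110, 100],
    "g_onc": [20, 300, 310, 290, 305],
    "g_flat": [100, 100, 100, 100, 100],
}
print(select_by_profile(expression, SUPPRESSOR_PROFILE))
print(select_by_profile(expression, ONCOGENE_PROFILE))
```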

Examples 


Example 1 

Quantification of gene expression using microarrays 
Material

Colon tumor and normal oral resection edge biopsies were sampled from each
patient after informed consent was obtained, and after removal of the necessary
amount of tissue for routine pathological examination. The numbers of tissue samples
examined were: normal resection edge, 6; Dukes A, 5; Dukes B, 6; Dukes C, 6; Dukes D, 4.
The six normal tissue samples were all from Dukes A individuals.
RNA from different tumors of the same stage was combined to form each pool.
Five such pools were prepared: Normal pool, Dukes A pool, Dukes B pool, Dukes
C pool and Dukes D pool. All tumors and normal tissue specimens were from the
sigmoid or upper rectum.

Preparation of mRNA 

Total RNA was isolated using the RNAzol B RNA isolation method (WAK-Chemie
Medical GmbH). Poly(A)+ RNA was isolated by an oligo-dT selection step
(Oligotex mRNA kit from Qiagen).

Preparation of cRNA 

One µg mRNA was used as starting material for the cDNA preparation. The first and 
second strand cDNA synthesis was performed using the SuperScript Choice System 
(Life Technologies) according to the manufacturer's instructions, except that an 
oligo-dT primer containing a T7 RNA polymerase promoter site was used. Labeled 
cRNA was prepared using the MEGAscript In Vitro Transcription kit (Ambion). 
Biotin-labeled CTP and UTP (Enzo) were used in the reaction together with 
unlabeled NTPs. Following the IVT reaction, the unincorporated nucleotides were 
removed using RNeasy columns (Qiagen). 

Array hybridization and scanning 

Ten µg of cRNA was fragmented at 94°C for 35 min in a fragmentation buffer 
containing 40 mM Tris-acetate pH 8.1, 100 mM KOAc, 30 mM MgOAc. Prior to 
hybridization, the fragmented cRNA in a 6xSSPE-T hybridization buffer (1 M NaCl, 
10 mM Tris pH 7.6, 0.005% Triton) was heated to 95°C for 5 min and subsequently 
to 40°C for 5 min before loading onto an Affymetrix probe array cartridge. The 
probe array was then incubated for 16 h at 40°C at constant rotation (60 rpm). The 
washing and staining procedure was performed in the Affymetrix Fluidics Station. 
The probe array was exposed to 10 washes in 6xSSPE-T at 25°C followed by 4 
washes in 0.5xSSPE-T at 50°C. The biotinylated cRNA was stained with a 
streptavidin-phycoerythrin conjugate, 10 µg/ml (Molecular Probes, Eugene, OR) in 
6xSSPE-T for 30 min at 25°C followed by 10 washes in 6xSSPE-T at 25°C. The 
probe arrays were scanned at 560 nm using a confocal laser scanning microscope 
with an argon ion laser as the excitation source (made for Affymetrix by Molecular 
Dynamics). Following this scan, the array was incubated with an anti-avidin antibody 
and a biotinylated anti-immunoglobulin, and the streptavidin-phycoerythrin step 
was repeated. 

The readings from the quantitative scanning were analyzed by the Affymetrix Gene 
Expression Analysis Software. 

Normalization of data 

To compare samples, normalization of the data was necessary. For that purpose we 
compared scaling to a total GAPDH intensity (sum of 3', middle and 5' probe sets) 
of 7000 units with scaling to a total array intensity (global scaling) of 281850 
units (averaging 150 units per probe set). Both gave similar results, with scaling 
factors that differed by less than ten percent in a set of experiments. Based on 
this we chose the global scaling for all experiments. 
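
The global scaling described above can be illustrated with a minimal sketch. It 
assumes that the scaling factor is simply the target average (150 units per probe 
set) divided by the plain mean intensity of the array; the exact averaging used by 
the Affymetrix software is not stated in this text, and the function names and 
intensity values below are hypothetical. 

```python
from statistics import mean

TARGET_MEAN = 150.0  # target average intensity per probe set, as stated in the text


def global_scaling_factor(intensities, target_mean=TARGET_MEAN):
    """Factor that brings the array's mean intensity to target_mean.

    Assumption: a plain arithmetic mean over all probe sets is used.
    """
    return target_mean / mean(intensities)


def scale_array(intensities, target_mean=TARGET_MEAN):
    """Multiply every probe-set intensity by the global scaling factor."""
    factor = global_scaling_factor(intensities, target_mean)
    return [value * factor for value in intensities]


def factors_agree(factor_a, factor_b, tolerance=0.10):
    """True if two scaling factors differ by less than `tolerance` (relative)."""
    return abs(factor_a - factor_b) / max(factor_a, factor_b) < tolerance


# Hypothetical raw intensities for a tiny four-probe-set "array".
raw = [100.0, 200.0, 300.0, 600.0]
scaled = scale_array(raw)
print(global_scaling_factor(raw), scaled)
```

The helper `factors_agree` mirrors the comparison made in the text, where the 
GAPDH-based and the global scaling factors were found to differ by less than ten 
percent. 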

Example 2 

Change of transcript level during the progression of colon cancer 

Biopsies from human colon tumors were analyzed as pools of tumors representing 
the different stages in the progression of the colon cancer disease. A total of 4 
tumor pools were used, each pool made by combining four to six tumors (see 
materials and methods). To generate a normal reference material, we pooled 
biopsies from normal colon mucosa from six volunteers. 

From the biopsies, RNA was extracted and reverse transcribed to cDNA, and the 
cDNA was transcribed into labelled cRNA, which was incubated on the array 
cartridges followed by scanning and scaling to a global array intensity amounting 
to 150 units per probe set. The scaling made it possible to compare individual 
experiments to each other. To verify the reproducibility, double determinations 
were made in selected cases and showed a good correlation. 

The software GeneArray Analysis Suite 3.1 from Affymetrix, Inc. was used to 
analyse the array data. In this software, increased levels indicate that the 
transcript is either up-regulated at the stated level or turned on de novo, 
reaching a given fold above the background level. Decreased levels in a similar 
way indicate reduction or loss of transcript. Alterations of a single transcript 
during the progression of the colon cancer disease can follow several different 
pathways. Some of the transcript changes reflect the transition from normal cells 
to tumor cells, others an increase in malignancy from Dukes A to Dukes B. 



Example 2 



A. Finding classifiers and predictors etc. of colorectal cancer based on a 
spreadsheet approach 

We used a spreadsheet to sort genes based on different parameters obtained from 
the Affymetrix analysis software. 

The mRNA expression analysis on the Affymetrix arrays resulted in 42.843 
datasets altogether, identifying individual genes (Table I) or ESTs (Table II). 
These were obtained from the 6.8k arrays (7.129 datasets) and the EST arrays 
(35.714 datasets). 

Description of the sorting procedure for the spreadsheet sorting 

For each dataset the following was listed: 

Probe Set No.; present or absent in normal tissue or the different Dukes types; 
gene name or homology or number; "Avg Diff", which is the level of expression; 
"Abs Call", which determines if the gene is present (P) or absent (A); "Diff Call", 
which determines the alteration as increasing (I) or decreasing (D); "fold change", 
the fold change from the normal tissue expression level; and the "sort score", 
which determines the likelihood that the change is real (if above 0.5). 

The following steps were performed: 

1. exclude data if "Probe Set" is an AFFX marker (58 per array or sub-array) 

2. exclude data if "Diff Call" in all 4 comparisons is "NC" (no change) 

3. exclude data if "Abs Call" in all 4 comparisons is "A" (absent) 

4. exclude data if three "Diff Call" are "NC" and one is "MI" or "MD" 

5. select data with the absolute value of the "sort score" arbitrarily set to >= 0,5 
(At this step the sorting resulted in the following number of genes sorted as 
being of importance: 908 genes (12,7 %) and 4155 ESTs (11,6 %).) 

6. sort according to pattern of Abs Calls (e.g. PAAAA = lost from N to tumour 
Dukes A, B, C, D) 

7. select data with Avg Diff of >= 300 (500 for some ESTs) and/or fold change >= 
3 (>= 5 for some ESTs) 

Number of genes sorted out as being of interest after this final sorting: 130 
genes (1,8 %) and 240 ESTs (0,7 %). 
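
The exclusion and selection steps listed above can be sketched in code as 
follows. This is only an illustration under stated assumptions: the dictionary 
layout, the function names and all example values are hypothetical; step 4 is read 
as referring to the Diff Call (since NC, MI and MD are Diff Call values); step 5 is 
read as requiring at least one comparison with an absolute sort score of 0.5 or 
more; and "and/or" in step 7 is implemented as a logical or. Step 6 (sorting by 
Abs Call pattern) is an ordering step and is left out. 

```python
def passes_exclusion_steps(row):
    """Steps 1-5 for one dataset; `row` is a dict with a hypothetical layout."""
    diff_calls = row["diff_calls"]  # one Diff Call per tumour-vs-normal comparison
    if row["probe_set"].startswith("AFFX"):                # step 1
        return False
    if all(call == "NC" for call in diff_calls):           # step 2
        return False
    if all(call == "A" for call in row["abs_calls"]):      # step 3
        return False
    marginal = sum(call in ("MI", "MD") for call in diff_calls)
    if diff_calls.count("NC") == 3 and marginal == 1:      # step 4 (assumed Diff Call)
        return False
    # Step 5, read here as: at least one comparison has |sort score| >= 0.5.
    return any(abs(score) >= 0.5 for score in row["sort_scores"])


def passes_final_selection(row, min_avg_diff=300, min_fold=3):
    """Step 7: keep if Avg Diff and/or fold change reaches its threshold."""
    high_level = max(row["avg_diffs"]) >= min_avg_diff
    big_change = max(abs(fc) for fc in row["fold_changes"]) >= min_fold
    return high_level or big_change


# Entirely hypothetical datasets, for illustration only.
datasets = [
    {"probe_set": "AFFX-demo_at", "abs_calls": "PPPPP",
     "diff_calls": ["I", "I", "I", "I"], "sort_scores": [1.0, 1.0, 1.0, 1.0],
     "avg_diffs": [900, 950, 990, 980, 970], "fold_changes": [4.0, 4.0, 4.0, 4.0]},
    {"probe_set": "demo_lost_at", "abs_calls": "PAAAA",
     "diff_calls": ["D", "D", "D", "D"], "sort_scores": [-1.2, -1.0, -0.9, -1.1],
     "avg_diffs": [800, 20, 15, 10, 12], "fold_changes": [-8.0, -9.0, -9.5, -9.0]},
    {"probe_set": "demo_marginal_at", "abs_calls": "PPPPP",
     "diff_calls": ["NC", "NC", "NC", "MI"], "sort_scores": [0.0, 0.1, 0.0, 0.6],
     "avg_diffs": [400, 410, 405, 395, 450], "fold_changes": [1.0, 1.0, 1.0, 1.2]},
    {"probe_set": "demo_weak_at", "abs_calls": "PPPPP",
     "diff_calls": ["I", "NC", "NC", "NC"], "sort_scores": [0.7, 0.0, 0.0, 0.0],
     "avg_diffs": [50, 120, 60, 55, 52], "fold_changes": [2.4, 1.0, 1.0, 1.0]},
]

selected = [row["probe_set"] for row in datasets
            if passes_exclusion_steps(row) and passes_final_selection(row)]
print(selected)
```

In this toy input only the dataset with the PAAAA pattern survives: the AFFX 
control falls at step 1, the marginal dataset at step 4, and the weak dataset at 
step 7. 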



The following tables show the genes (Table I) and ESTs (Table II) that were 
identified by this approach, analyzing the hu 6.8K FL gene array. First a list of 
the potential tumor suppressors, then a list of the potential oncogenes, and 
finally a list of genes that can be used to classify the different Dukes stages. 
Genes that are in bold are those that we find to be of the utmost interest. 



The table (Table III) that follows this section is based on the hu EST arrays 
Hu35k Sub A, B, C, D. These are also divided into ESTs that are supposed to be 
expressed from tumor suppressors and oncogenes, as well as from genes that can 
be used as classifiers of the different Dukes stages. The most interesting ESTs 
are shown in bold. 
Table I 

Fold change in comparison to normal 

SUPPRESSOR CLASSIFIER 

CRC classifier genes lost (PAAAA or PPAAA) 

Gene name | Acc No | Avg Diff N | Avg Diff A | B 
Human chromogranin A mRNA, complete cds | J03915 | 831 | lost | lost 
Human adipsin/complement factor D mRNA, complete cds | M84526 | 822 | lost | lost 
Homo sapiens MLC-1V/Sb isoform gene | M24248 | 799 | lost | lost 
Human aminopeptidase N/CD13 mRNA encoding aminopeptidase N, complete cds | M22324 | 657 | lost | lost 
H.sapiens MT-11 mRNA | X76717 | 650 | lost | lost 
H.sapiens GCAP-II gene | Z70295 | 572 | lost | lost 
Human somatostatin I gene and flanks | J00306 | 516 | lost | lost 
Human YMP mRNA, complete cds | U52101 | 459 | lost | lost 
H.sapiens mRNA for beta subunit of epithelial amiloride-sensitive sodium channel | X87159 | 439 | lost | lost 
Human K12 protein precursor mRNA, complete cds | U77643 | 429 | 121 | lost 
Human sulfate transporter (DTD) mRNA, complete cds | U14528 | 397 | lost | lost 
Human transcription factor hGATA-6 mRNA, complete cds | U66075 | 337 | lost | lost 
H.sapiens SCAD gene, exon 1 and joining features | Z80345 | 326 | lost | lost 
Human S-lac lectin L-14-II (LGALS2) gene | M87860 | 301 | lost | lost 
Human mRNA for protein tyrosine phosphatase | D15049 | 277 | 43 | lost 
H.sapiens mRNA for tetranectin | X64559 | 235 | lost | lost 
Human 11kd protein mRNA, complete cds | U28249 | 233 | 47 | lost 
Human anti-mullerian hormone type II receptor precursor gene, complete cds | U29700 | 223 | lost | lost 
Human heparin binding protein (HBp17) mRNA, complete cds | M60047 | 218 | lost | lost 
Human ADP-ribosylation factor (hARF6) mRNA, complete cds | M57763 | 209 | lost | lost 
beta-ADD=adducin beta subunit 63 kda isoform/membrane skeleton protein {alternatively spliced, exon 10 to 13 region} [human, Genomic, 1851 nt, segment 3 of 3] | S81083 | 188 | lost | lost 
Zinc Finger Protein Znf155 | HG4243-HT4513 | 186 | lost | lost 
Human glucagon mRNA, complete cds | J04040 | 182 | 25 | lost 
H.sapiens mRNA for hair keratin, hHb5 | X99140 | 158 | lost | lost 
Human tubulin-folding cofactor E mRNA, complete cds | U61232 | 150 | lost | lost 
Human integrin alpha-3 chain mRNA, complete cds | M59911 | 126 | lost | lost 
Human NACP gene | U46901 | 123 | lost | lost 
H.sapiens mRNA for flavin-containing monooxygenase 5 (FMO5) | Z47553 | 110 | lost | lost 
Human mRNA for ATF-a transcription factor | X52943 | 104 | lost | lost 
H.sapiens intestinal VIP receptor related protein mRNA | X77777 | 93 | lost | lost 


Acc No 


Avg Diff 


fold change to N 








g§jjj 




AF015913 


188 


Lost 




HG1067- 


501 


Lost 




HT1067 








U72342 


114 


Lost 




L11285 


1470 


-5,2 




J02854 


2047 


-4,5 




X15525 


285 


-4,4 




X53331 


1069 


-4,2 




X62535 


362 


-3,5 




M11717 


405 


-3,2 




M63379 


1594 


-3 




D10511 


198 


B 

lost 




D15050 


232 


lost 





I Gene name 
Only A Classifier ;J 
Homo sapiens SKBIHs "mRNA," complete cds. 
/gb=AF01591 3 /ntype=RNA 
Mucin (Gb:M22406) 

Human platelet activating factor "acetylhydrolase," brain 
"isoform," 45 kDa subunit (LIS1) gene 
Homosapiens ERK activator kinase (MEK2) mRNA 
Human 20-kDa myosin light chain (MLC-2) "mRNA,'' 
complete cds 

H.sapiens lysosomal acid phosphatase gene (EC 

3.1.3.2) Exon 1 (and joined CDS). 

Human mRNA for matrix Gla protein 

H.sapiens mRNA for diacylglycerol kinase 

Human heat shock protein (hsp 70) gene, complete cds. 

Human TRPM-2 protein gene 



Human gene for mitochondrial acetoacetyl-CoA thiolase 
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cds 

Human mRNA for KIAA0248 "gene," partial cds 
Homo sapiens (clone CC6) NADH-ubiquinone oxidore- 
ductase subunit "mRNA," 3' end cds 
Human phosphoglucomutase 1 (PGM1) "mRNA," 
complete cds 

Homo sapiens guanylin "mRNA," complete cds 

"Human trans-Golgi p230 ""mRNA,"" complete cds" 
H. sapiens mRNA for vacuolar proton "ATPase," subunit 
D 

H.sapiens mRNA for 3-hydroxy-3-methylglutaryl 
coenzyme A synthase 

Human mRNA for KIAA0018 "gene," complete cds 

"Mucin ""1, Epithelial,"" Alt. Splice 9" 

H.sapiens mRNA for L-3-hydroxyacyl-CoA dehydrogena- 



se! 



Homo sapiens colon mucosa-associated (DRA) 
"mRNA," complete cds 
Human Ig J chain gene 

Human selenium-binding protein (hSBP) "mRNA " 

complete cds. /gb=U29091 /ntype=RNA 

H.sapiens mRNA for sigma 3B protein 

Human ERK1 mRNA for protein serine/threonine 

kinase 

Human mRNA for mitochondrial 3-oxoacyl-CoA "thi- 
olase," complete cds 

"Biliary ""Glycoprotein,"" Alt. Splice ""5,"" A" 

Human AQP3 gene for aquaporine 3 (water "channel)," 
partail cds 

Human CD14 mRNA for myelid cell-specific leucine-rich 
glycoprotein 

Human thioredoxin "mRNA," nuclear gene encoding 

mitochondrial "protein," complete cds 

Human mitochondrial ATPase coupling factor 6 subunit 

(ATP5A) "mRNA," complete cds 

"Human MHC class II H LA-DP light chain ""mRNA,"" 

complete cds" 

Human mRNA for early growth response protein 1 
(hEGR1) 

Human mRNA for mitochondrial 3-ketoacyl-CoA thiolase 
beta-subunit of trifunctional "protein," complete cds 
Homo sapiens laminin-related protein (LamA3) "mRNA," 
complete cds 

H.sapiens mRNA for selenoprotein P 

Human hkf-1 "mRNA," complete cds 

Homo sapiens nuclear domain 10 protein (ndp52) 

"mRNA," complete cds 

Human X104 "mRNA," complete cds 

H. sapiens cDNA for RFG 

H.sapiens mRNA for Progression Associated Protein 
Human liver "2,4-dienoyl-CoA" reductase "mRNA," com- 
plete cds 

Human A33 antigen precursor "mRNA," complete 
cds 

H.sapiens pS2 protein gene 

Human RASF-A PLA2 "mRNA," complete cds 

Homo sapiens pstl mRNA for pancreatic secretory inhi- 
bitor (expressed in neoplastic tissue). 
Human CO-029 

wmpwf&wm: ; *"•;• 

Human complement component C3 "mRNA," alpha 

and beta "subunits," complete cds 

H.sapiens mRNA for adenosine "triphosphatase," 

calcium 

Human skeletal muscle LIM-protein SLIM1 "mRNA," 
complete cds 

Human platelet-derived growth factor receptor alpha 



D87435 
L04490 


374 lost 
683 lost 


M83088 


1096 lost 


M97496 

U41740 
X71490 


4983 lost 

131 lost 
414 lost 


X83618 


2196 lost 


HG371- 

HT26388 

X96752 


377 -7 7 

3296 -4J 

252 -3 


I I 


N |C 


L02785 


2978 Lost 


M12759 
U29091 


2193 Lost 
1849 Lost 


X99459 
X60188 


722 Lost 
576 Lost 


D16294 


529 Lost 


HG2850- 

HT4814 

AB001325 


489 Lost 
413 Lost 


\y A 1 o **» A 

X13334 


413 Lost 


U78678 


41 1 Lost 


M37104 


373 Lost 


M57466 


327 Lost 


X52541 


281 Lost 


D16481 


268 Lost 


L34155 


252 Lost 


Z11793 
D76444 
U22897 


232 Lost 
211 Lost 
150 Lost 


L27476 
X77548 
Y07909 
U49352 


149 Lost 
130 Lost 
128 Lost 
101 Lost 


U79725 


1650 -6,9 


X52003 
M22430 
Y00705 


4298 -6 
4983 -5 8 
344 -3,1 


M35252 


3500 -3 


1 1 


N f||Pitft 


K02765 


744 lost 


Z69881 


439 lost 


U60115 


281 lost 


M21574 


187 lost 
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(PDGFRA) "mRNA," complete cds 

Human mRNA for KIAA0247 "gene," complete cds D87434 172 lost 

Human mRNA for KIAA01 71 "gene," complete cds D79993 151 lost 

Human Down syndrome critical region protein (DSCR1) U28833 150 lost 
"mRNA," complete cds 

Human Ki nuclear autoantigen "mRNA," complete cds U11292 125 lost 

gB^jaSffier . jgL_^L__ „.MlZ J I " l A W :\ 

Homo sapiens chromosome 16 BAC clone CIT987SK- AF001548 3513 -3,6 -4,3 
815A9 complete sequence. 

Human mRNA for ATP synthase alpha "subunit," com- D14710 3580 -3,8 -5,6 
plete cds 

IMIssgier-" " "~: -J | N |B |C -\y.^,.\ 

Human mRNA for IgG Fc binding "protein," complete D84239 3755 -19,3 -7,1 
cds 

H. sapiens mRNA for carcinoembryonic "antigen," X98311 2456 -12 -6,5 
CGM2 

"Homo sapiens (clone lamda-hPEC-3) phosphoenol- L05144 2630 -7,6 -14,7 

pyruvate carboxy kinase (PCK1) ""mRNA,"" complete 

cds" 

Human 11-beta-hydroxysteroid dehydrogenase type 2 U26726 1865 -7,1 -4,7 
"mRNA," complete cds 

"Human intestinal mucin (MUC2) ""mRNA,"" complete L21998 7803 -5,5 -4,2 
cds" 

Human mRNA for KIAA0106 "gene," complete cds D14662 766 -4,7 -3,2 

metaltothionein V00594 5417 -4 -6,3 



Table I (cont.) 

Fold change in comparison to normal 

ONCOGENE CLASSIFIER 

CRC classifier genes gained 

Gene name | Acc No | Avg Diff A | Avg Diff B | Change 
Homo sapiens (clones MDP4, MDP7) microsomal dipeptidase (MDP) mRNA, complete cds | J05257 | 1606 | 1403 | gained 
Homo sapiens reg gene homologue, complete cds | L08010 | 1165 | 294 | gained 
H.sapiens mRNA for prepro-alpha2(I) collagen | Z74616 | 1003 | 905 | gained 
Human S-adenosylhomocysteine hydrolase (AHCY) mRNA, complete cds | M61832 | 882 | 817 | gained 
Transcription Factor IIIa | HG4312-HT4582 | 837 | 948 | gained 
Human gene for melanoma growth stimulatory activity (MGSA) | X54489 | 731 | 330 | gained 
Human stromelysin-3 mRNA | X57766 | 643 | 1116 | gained 
CDC25Hu2=cdc25+ homolog [human, mRNA, 3118 nt] | S78187 | 603 | 627 | gained 
Human mRNA for cripto protein | X14253 | 532 | 293 | gained 
Human transformation-sensitive protein (IEF SSP 3521) mRNA, complete cds | M86752 | 529 | 866 | gained 
Human complement component 2 (C2) gene allele b | L09708 | 515 | 625 | gained 
H.sapiens mRNA for ITBA2 protein | X92896 | 444 | 459 | gained 
H.sapiens encoding CLA-1 mRNA | Z22555 | 422 | 549 | gained 
Human fibroblast growth factor receptor 4 (FGFR4) mRNA, complete cds | L03840 | 359 | 276 | gained 
Fibronectin, Alt. Splice 1 | HG3044-HT3742 | 354 | 261 | gained 
tyk2 | X54667 | 336 | 352 | gained 
Human mRNA for B-myb gene | X13293 | 333 | 322 | gained 
Human phosphofructokinase (PFKM) mRNA, complete cds | U24183 | 296 | 426 | gained 
Human pre-B cell enhancing factor (PBEF) mRNA, complete cds | U02020 | 276 | 242 | gained 
Human SH2-containing inositol 5-phosphatase (hSHIP) mRNA, complete cds | U57650 | 254 | 315 | gained 
Human interleukin 8 (IL8) gene, complete cds | M28130 | 251 | 609 | gained 
Human lamin B receptor (LBR) mRNA, complete cds | L25931 | 239 | 193 | gained 
H.sapiens mRNA for protein tyrosine phosphatase | Z48541 | 228 | 151 | gained 
Human mRNA for unc-18 homologue, complete cds | D63851 | 217 | 198 | gained 
H.sapiens mRNA for Zn-alpha2-glycoprotein | X59766 | 215 | 156 | gained 
 | Z25521 | 215 | 127 | gained 
Human asparagine synthetase mRNA, complete cds | M27396 | 212 | 195 | gained 
Human hepatitis delta antigen interacting protein A (dipA) mRNA, complete cds | U63825 | 211 | 231 | gained 
Human splicesomal protein (SAP 61) mRNA, complete cds | U08815 | 157 | 201 | gained 
Human protein kinase C-binding protein RACK7 mRNA, partial cds | U48251 | 129 | 71 | gained 
Human MAC 30 mRNA, 3' end | L19183 | 128 | 224 | gained 
Human thrombospondin 2 (THBS2) mRNA, complete cds | L12350 | 111 | 126 | gained 
Human nicotinamide N-methyltransferase (NNMT) mRNA, complete cds | U08021 | 107 | 261 | gained 
H.sapiens mRNA for type I interstitial collagenase | X54925 | 105 | 123 | gained 
Human cytochrome b561 gene | U29463 | 85 | 85 | gained 
Human H19 RNA gene, complete cds (spliced in silico) | M32053 | 72 | 4498 | gained 
Human collagen type XVIII alpha 1 (COL18A1) mRNA, partial cds | L22548 | 67 | 275 | gained 
Human clone 23733 mRNA, complete cds | U79274 | absent | 162 | gained 



Gene name 


ACC NO 


Avg 
Diff 


fold change to N 


Only A^J^fieT 1|; § ;fpr 




A 






Human migration inhibitory factor-related protein 8 
(MRP8) "gene," complete cds 


M21005 


120 GAINED 




Human acyloxyacyl hydrolase "mRNA," complete cds 


M62840 


130 GAINED 




Human PEP19 (PCP4) "mRNA," complete cds 


U52969 


174 GAINED 




H.sapiens Humig mRNA 


X72755 


118 GAINED 




H.sapiens PISSLRE mRNA 


X78342 


125 GAINED 




H.sapiens mRNA for twist "protein," partial. /gb=Y1 1 1 80 
/ntype=RNA 


Y11180 


121 


GAINED 




Human mRNA forTGF-beta superfamily "protein," com- 
plete cds 


AB000584 


1372 


3,5 




Human mRNA for "MSS1 ," complete cds 


D11094 


292 


3,1 




Human complement factor B "mRNA," complete cds 


L15702 


2082 


3,3 




"Homo sapiens GTP-binding protein (RAB2) ""mRNA,"" 
complete cds" 


M28213 


289 


3,1 




Human translational initiation factor 2 beta subunit (elF-2- 
beta) "mRNA," complete cds 


M29536 


956 


4,1 




Human E16 "mRNA," complete cds 


M80244 


278 


3,8 




IEX-1=radiation-inducible immediate-early gene "[hu- 
man;' "placenta," mRNA "Partial," 1223 nt] 


S81914 


1531 


3,6 




Human CDC16Hs "mRNA," complete cds 


U18291 


244 


6,1 




Human DD96 "mRNA," complete cds 


U21049 


625 


3,2 




Human (memc) "mRNA," 3'UTR. /gb=U30999 
/ntype=RNA 


U30999 


256 


3,8 




"Human ubiquitin-conjugating enzyme (UBE2I) 
""mRNA,"" complete cds" 


U45328 


448 


10,6 




"Human fetal brain glycogen phosphorylase B ""mRNA,"" 
complete cds" 


U47025 


2349 


3,7 




"Human BTG2 (BTG2) ""mRNA,"" complete cds" 


U72649 


527 


5,2 




Human jun-B mRNA for JUN-B protein 


X51345 


1350 


4,6 
























Human adipocyte lipid-binding "protein," complete cds 


J02874 


268 


GAINED 


Human A1 protein "mRNA," complete cds 


U29680 j 


102 


GAINED 


Human LGN protein "mRNA," complete cds 


U54999 


110 


GAINED 


Human skeletal muscle LIM-protein SLIM2 "mRNA," 
partial cds 


U60116 


109 


GAINED 


Human mRNA for alphal-acid glycoprotein (orosomuco- 
id) 


X02544 


156 


GAINED 


Human mRNA for fibronectin receptor alpha subunit 


X06256 


46 


GAINED 
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n. sapiens m-oac^ 1 mKiNA 




07R 


GAINED 


H. sapiens mRNA for fibulin-2 


YQOvl QA 


OQyt 

^tJ4 


GAINED 


H. sapiens 5T4 gene for 5T4 Oncofetal antigen 




•1 K.O 


GAINED 


Homo sapiens mRNA for osteoblast specific factor 2 

(Uor-^OSj 


D 13666 


324 


7,6 




iviaczo 


UriQQ-7 UTQfl7 


0770 


3,3 


"Human lysozyme ""mRNA,"" complete cds with an Alu 
repeat in the 3' flank" 


J03801 


920 


3,7 


Human metalloproteinase (HME) "mRNA, 11 complete cds 


LzooUo 


/y4 


7,4 


Human alpha-1 collagen type IV gene, exon 52. 


M26576 


610 


4,9 


Human lumican "mRNA," complete cds 


U21 128 


1 1 05 


4,1 


Human mRNA for fibronectin (FN precursor) 


X02761 


4181 


5,5 


Human mRNA fragment for elongation factor TU (N- 
terminus). /gb=X03689 /ntype=RNA 


X03689 


3515 


3,1 


Human mRNA for type IV collagen alpha -2 chain 


X05610 


1531 


3 


Human mRNA for collagen VI alpha-1 C-terminal globu- 
lar domain 


X 15880 


2062 


3,5 


"H. sapiens," gene for Membrane cofactor protein 


X59405 


272 


3,4 


H. sapiens SOD-2 gene for manganese superoxide dis- 
mutase. /gb=X65965 /ntype=DNA /annot=exon 


X65965 


234 


3,1 


H. sapiens NMB mRNA 


X76534 


338 


3,3 


H. sapiens vimentin gene 


Z19554 


3472 


3,2 








Only C Classifier ,.• 




. . .C ; : K 




Ribosomal Protein L39 Homolog 


HG2874- 
HT301 8 


102 


GAINED 


Homo sapiens (clone d2-1 15) kappa opioid receptor 
(ukkm ) mKiNA, complete COS 


L37362 


168 


GAINED 


Human kell blood group protein mRNA 


ivio4yo4 




GAINED 




1 I7Q ^ (57 


0/4 


GAINED 


Human cancellous bone osteoblast mRNA for serin 
protease with IGF-binding "motif," complete cds 


D87258 


504 


3,4 




Human interferon-inducible protein 27-Sep "mRNA," 
complete cds 


1 f\A Ad A 

JU4 1 o4 


77 *1 7 


3,8 


"Human sickle cell beta-globin ""mRNA,"" complete cds" 


M2o07y 


3090 


4,6 




M29277 


1 588 


3,7 


"Human spermidine synthase ""mRNA,"" complete cds" 


M34338 


866 


4,1 


Human copine I "mRNA," complete cds 


U83246 


2079 


3,7 


Only D Classifier - ^-mM :: ; 1 . ■ M il ill 




fistli 






Homo sapiens FRG1 "mRNA," complete cds 


L76159 


73 


GAINED 


Human cyclin protein "gene," complete cds 


M15796 


149 


GAINED 


Human U2 small nuclear RNA-associated B" antigen 
"mRNA," complete cds 


M15841 


194 


GAINED 


Human mRNA export protein Rae1 (RAE1) "mRNA," 
complete cds. 


U84720 


193 


GAINED 


Human protease-activated receptor 3 (PAR3) "mRNA," 
complete cds. 


U92971 


142 


GAINED 


H. sapiens mRNA for mediator of receptor-induced toxi- 
city 


X84709 


200 


GAINED 


H. sapiens RFXAP mRNA 


Y12812 


230 


GAINED 


Human mRNA for "Qip1 ," complete cds 


AB002533 


8881 


2,7 




Human mRNA for transferrin receptor 


X01060 


557 


3 


"metastasis-associated gene ""[human,"" highly metasta- 
tic lung cell subline ""Anip^y],"" mRNA ""Partial;'" 978 
nt]" 


S79219 


216 


4 










■ 3k£ HI 






f: Umi 


Human chaperonin 1 0 "mRNA," complete cds 


U07550 


50 


4,1 


3,3 


H. sapiens RING4 cDNA 


X57522 


73 


4,9 


5,4 


H. sapiens genes TAP1 , TAP2, LMP2, LMP7 and DOB. 


X66401 


134 


3,2 


3,1 


H. sapiens mRNA for alpha 4 protein 


Y08915 


96 


3,7 


3,6 


Homo sapiens interleukin-1 receptor-associated kinase 
(IRAK) "mRNA," complete cds 


L76191 


285 


3,1 


3,1 


"Human von Willebrand factor ""mRNA,"" 3' end" 


M10321 


84 


3,7 


4,1 


Human chromosome segregation gene homolog CAS 
"mRNA," complete cds 


U33286 


86 


4,8 


3,6 


Human Bruton's tyrosine kinase-associated protein-135 


U77948 


68 


3,4 


4,9 
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"mRNA," complete cds. 










"Human KH type splicing regulatory protein KSRP 
""mRNA,"" complete cds." 


U94832 


52 


3,2 


3,2 


H. sapiens ADE2H1 mRNA showing homologies to SAI- 
CAR synthetase and AIR carboxylase of the purine 
pathway (EC "6.3.2.6," EC 4.1.1.21) 


X53793 


40 


3 


3,1 








N 


B C 




oiODin, oeia 


no i*fzo* 
HT1428 




3,1 


4,3 




"Human alpha-1 collagen type I ""gene/"' 3* end" 


M55998 


2706 


3,1 


3,7 




n. sapiens mKiNM tor protein 


A / UDOj 


1 7n 
I ou 


4,5 


4,5 




numan hi iainm ror coiiagen Dinaing protein <c, uoin- 
plete cds" 


L/OO 1 f H 


I O I 


8.1 


6,1 




Unman CDADO/nctonno^tin "mDMA 11 rrimnlota rHc 

numan or Ar\0/osieoneciiii rr irsiNM, uuniuitJLc uuo 




358 




Q 




Human PRAD1 mRNA for evelin 


X59798 


263 


3,3 


3,4 




















•"• N 


■ M ■ : 




c 


Human transforming growth factor-beta induced gene 
product (BIGH3) "mRNA," complete cds 


M77349 


426 


4,7 


6,7 


4,4 


"Human breast epithelial antigen BA46 ""mRNA,"" com- 


U58516 


169 


3,3 


3,2 


4,2 


plete cds" 














X57351 


460 


4,8 


3,5 


3,7 


H. sapiens NGAL gene 


X99133 


327 


8,3 


3,1 


4,8 


Human mRNA for MDNCF (monocyte-derived neu- 


Y00787 


87 


5 


9,2 


13,4 


trophil chemotactic factor) 












H. sapiens EF-1 delta gene encoding human elongation 
factor-1 -delta 


Z21507 


198 


4,4 


6,8 


4,5 


H. sapiens mRNA for prepro-alphal(l) collagen 


Z74615 


285 


5 


8,2 


6,1 


Nuclear Factor Nf-116 


HG3494- 
HT3688 


246 


4,3 


4,4 


4,2 




U29175 


62 


4,3 


3,6 


4,4 



ABCD Classifier 


mn 


N 


A BCD 


"HNL= neutrophil lipocalin ""[human,"" ovarian can- 
cer cell line ""OC6,"" mRNA ""Partial,"" 534 nt]. 
/gb=S75256 /ntype=RNA" 


S75256 


361 


8,8 


4,3 


7.7 


9 
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-0,4 


cm" 


-5,64 


-0,52 


14,5 




I r- O 




■ 10 
f co" 




■ LO" 
i CO 




Q 


a 


Q 


Q 


Q 




< 
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< 


< 
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CO 


CO 


LO 


-225 




-<* 

CO 

i 


-2,73 




-6,76 


15,8 




-2,8 


CO 


-13,8? 


i 00 
i of 


34,8 




Q 


a 


Q 


Q 


Q 




< 


< 


< 


< 


< 




CO 
CO 
CM 




CO 


CO 
CO 


-103 




-1,18 


-1,4 


-8,86 


-6,91 


17,5 




CO 

c v 


CO 


-10,1 


10,0 


39,6 




Q | 


o 


Q 


o 






< 


< 


< 


< 
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CO 


T— 

CM 


CM 
05 




CO 
OJ 

T— 

1 




CM 


LO 
CO 
CM 


' CO <*> 

co" 


-6,53 


h- 

LO" 




-4,8 


<J) 

co" 


17,7 


i CD 


/ CO 




a 


Q 


Q 


Q 


a 




< 


< 


< 


< 
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CO 

t 

CM 


CO 


LO 

• 


CO 
LO 


-100 




0- 


Q. 


Q_ 


Q_ 


Q_ 






CO 

s 


o 

CO 


CO 
LO 


CO 
CO 
LO 


74775 3' 
similar to 
gb:M2581 
3 FIBRI- 
NOGEN- 
LIKE 

PROTEIN 

(HUMAN). 


zc76c03.s 
1 Pancre- 
atic Islet 
Homo 
sapiens 
cDNA 
clone 
328228 3'. 


zs92a11.s 

NCI CGA 

P_GCB1 

Homo 

sapiens 

cDNA 

clone 

IMAGE70 

4924 3'. 


ye79f11.s1 

Homo 

sapiens 

cDNA 

clone 

123981 3". 


zk87c05.s 

1 Soares 

pregnant 

uterus 

NbHPU 

Homo 

sapiens 

cDNA 

clone 

489800 3'. 


EST21862 
Adrenal 




secretagogin; 
dJ501N12.8 (putative 
B. Finding potential classifier genes for colorectal cancer (Dukes A, B, C & D) 
by sorting according to Pearson correlation coefficient 

Primary selection criteria for classifier genes: 

1. All genes with a score of A (AbsCall) or NC (DiffCall) for all groups (N, A, B, C & D) were removed. 

2. Genes with a fold change below 5 and a Sort Score below 0.5 were removed. 

3. If DiffCall was NC for a gene in a particular experiment, the FC was set to 1. 

Secondary selection criteria for classifier genes: 

Based on the Pearson correlation coefficient (figure 1), genes similar to a predefined profile were selected. 

$$
r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^{2} - \left(\sum X\right)^{2}\right]\left[n\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}
$$

Figure 1: Pearson correlation coefficient (r) 
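
As an illustration of the secondary selection, the following sketch computes the 
Pearson correlation coefficient in its sum-based form and matches a gene's fold 
changes against predefined stage profiles. The profiles for A, B and C are those 
named in Table III; the D profile (0, 0, 0, 1) is assumed by analogy. The function 
names, the handling of a flat profile and the example fold changes are 
hypothetical, and no correlation cut-off is implied, since none is stated in the 
text. 

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient, computed from sums over x and y."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        return 0.0  # flat series: r is undefined; treated as "no match" here
    return (n * sum_xy - sum_x * sum_y) / denominator


# Predefined profiles over Dukes A, B, C, D; the D entry is an assumption.
PROFILES = {
    "A": (1, 0, 0, 0),
    "B": (0, 1, 0, 0),
    "C": (0, 0, 1, 0),
    "D": (0, 0, 0, 1),
}


def best_profile(fold_changes):
    """Return (stage, r) for the predefined profile that correlates best."""
    scores = {stage: pearson_r(fold_changes, profile)
              for stage, profile in PROFILES.items()}
    stage = max(scores, key=scores.get)
    return stage, scores[stage]


# Hypothetical fold changes for A, B, C, D versus normal: changed in Dukes B only.
# Unchanged comparisons are 1, following primary criterion 3 (NC gives FC = 1).
example = (1.0, 6.0, 1.0, 1.0)
print(best_profile(example))
```

A gene that changes in one stage only correlates perfectly with that stage's 
profile and negatively with the others, which is what makes the profile usable as 
a classifier template. 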
Classifier genes for Dukes A, B, C and D: 



Table III 



A classifiers (Profile 1, 0, 0, 0), Pearson correlations approach 
H. sapiens gene for major histocompatibility complex encoded proteasome subunit 
LMP7. 

RAD23A gene (human RAD23A homolog) extracted from Homo sapiens DNA from 
chromosome 19p13.2 cosmids "R31240," R30272 and R28549 containing the 
"EKLF," "GCDH," "CRTC," and RAD23A "genes," genomic sequence 
Human mRNA for KIAA0219 "gene," partial cds 

H. sapiens clathrin light chain a gene 

Human acid sphingomyelinase (ASM) "mRNA," complete cds 

"H. sapiens NOS2 ""gene,"" exon 27 /gb=X85781 /ntype=DNA /annot=exon" 

"Human gro-beta ""mRNA,"" complete cds" 

Human placenta (Diff33) "mRNA," complete cds 

Human mRNA for macrophage inflammatory protein-2beta (MIP2beta) 

Human kinase Myt1 (Myt1) "mRNA," complete cds. 

Mucin (Gb:M22406) 



EST: 

RC F03077 f 



Chromosome 17, clone hRPC.15 



SUBSTITUTE SHEET (RULE 26) 



WO 01/49879 



128 



PCT/DK00/00744 



RC_AA599199 

RC_AA207015 

RC_AA234916 

RC_N92239_a 

RC_N93958_s 

U95301_at 

RC_AA426330 

RC.AA024658 

RC H88540 a 



Alu seq 

clone RP4-733M16 on chromosome 1p36.1 1-36.23 
Chromosome 19 clone CTC-461H2 
Wnt inhibitory factor-1 (WIF-1), chromosome 12 
Phospholipase A2, group X (PLA2G10), 
Phospholipase A2, group X (PLA2G10), 
Chromosome 17, clone hRPC.1 110_E_20 
clone SCD-254N2 (UWGC:rg254N02) from 6p21 
heat shock protein 90, 1q21.2-q22 



B classifiers (Profile 0, 1, 0, 0) 

Hu6800: 



U57316_at 


Human GCN5 (hGCN5) "gene," complete cds 


X66839_at 


H. sapiens MaTu MN mRNA for p54/58N protein 


J04599_at 


Human hPGI mRNA encoding bone small proteoglycan I (biglycan), complete cds 


X57579_s_at 


H. sapiens activin beta-A subunit (exon 2) 


J02874_at 


Human adipocyte lipid-binding "protein," complete cds 


M11749_at 


Human Thy-1 glycoprotein "gene," complete cds 


U06863_at 


Human follistatin-related protein precursor "mRNA," complete cds 


uo i u i u_s_ai 


"Human nicotinamide N-methyltransferase ""gene,"" exon 1 and 5' flanking region. 




1 y U — <J \j 1 \J 1 \J f 1 1 Ly [JC — VJ \ ^ir \ 1 al II l\J l — CAvl 1 


U08021 at 


"Human nicotinamide N-methyltransferase (NNMT) ""mRNA,"" complete cds" 


HG3044-HT3742_s_at 


Fibronectin, Alt. Splice 1 


X02761_s_at 


Human mRNA for fibronectin (FN precursor) 


X02544_at 


Human mRNA for alphal-acid glycoprotein (orosomucoid) 


M62505_at 


Human C5a anaphylatoxin receptor "mRNA," complete cds 


J05070_at 


Human type IV collagenase mRNA, complete cds 


U16306_at 


Human chondroitin sulfate proteoglycan versican V0 splice-variant precursor peptide 




"mRNA," complete cds 


M14218_at 


Human argininosuccinate lyase "mRNA," complete cds 


L77567_s_at 


"Homo sapiens mitochondrial citrate transport protein (CTP) ""mRNA,"" 3' end" 


M63391_rna1_at 


Human desmin gene, complete cds. 


D13643_at 


Human mRNA for KIAA0018 "gene," complete cds 


D79985_at 


Human mRNA for KIAA0163 "gene," complete cds 


EST: 




M63262_at         5-lipoxygenase activating protein (FLAP), 13q12
R67290_at         Interleukin 14
N36619_at
L19161_at         Translation initiation factor 2, subunit 3, Xp22.2-22.1
RC_AA496035       Chromosome 1? (TIGR)
L29217_s_at       CDC-like kinase 3 (CLK3), 15q24
RC_W73194_a       Dermatopontin, 1q12-q23
RC_N69507_a       Hypothetical protein PRO1847 (Alu according to TIGR)
RC_H15814_s       adipose most abundant gene transcript 1
M84526_at         D component of complement (adipsin)



C classifiers (Profile 0, 0, 1, 0)



Hu6800:

M20681_at         Human glucose transporter-like protein-III (GLUT3), complete cds
D50914_at         Human mRNA for KIAA0124 gene, partial cds
L37362_at         Homo sapiens (clone d2-115) kappa opioid receptor (OPRK1) mRNA, complete cds
X66114_rna1_at    H. sapiens gene for 2-oxoglutarate carrier protein.
M32053_at         Human H19 RNA gene, complete cds (spliced in silico)
Y00787_s_at       Human mRNA for MDNCF (monocyte-derived neutrophil chemotactic factor)
U64444_at         Human ubiquitin fusion-degradation protein (UFD1L) mRNA, complete cds
X95325_s_at       H. sapiens mRNA for DNA binding protein A variant
X02419_rna1_s_at  H. sapiens uPA gene
X57522_at         H.sapiens RING4 cDNA
AB001325_at       Human AQP3 gene for aquaporine 3 (water channel), partial cds
AB002315_at       Human mRNA for KIAA0317 gene, complete cds. /gb=AB002315 /ntype=RNA
L12760_s_at       Human phosphoenolpyruvate carboxykinase (PCK1) gene, complete cds with repeats
M80899_at         Human novel protein AHNAK mRNA, partial sequence



EST: 

RC_AA122350

AA374109_at 

RC_AA621755 

RC_AA442069 

RC_T40767_a 

RC_AA488655 

RC_AA398908 

RC_AA447764

RC_N69136_a



Chromosome 8 

spondin 2, extracellular matrix protein, chromosome 4 

Transcription factor Dp-2, 3q23 

sodium channel 2, 12q12 

Chromosome 19 

Mus? 

Hypothetical protein, chromosome 4 



D classifiers (Profile 0, 0, 0, 1) 



X17644_s_at       Human GST1-Hs mRNA for GTP-binding protein
Y12812_at         H.sapiens RFXAP mRNA
X60486_at         H.sapiens H4/g gene for H4 histone
X52221_at         H.sapiens ERCC2 gene, exons 1 & 2 (partial)
L06175_at         Homo Sapiens P5-1 mRNA, complete cds
Z48481_at         H.sapiens mRNA for membrane-type matrix metalloproteinase 1
X54232_at         Human mRNA for heparan sulfate proteoglycan (glypican)
L08010_at         Homo sapiens reg gene homologue, complete cds
L27706_at         Human chaperonin protein (Tcp20) gene complete cds
L15533_rna1_at    Homo sapiens pancreatitis-associated protein (PAP) gene, complete cds.
X51408_at         Human mRNA for n-chimaerin
K02765_at         Human complement component C3 mRNA, alpha and beta subunits, complete cds
U38904_at         Human zinc finger protein C2H2-25 mRNA, complete cds



EST: 

RC_AA121433       Axin, chromosome 16
RC_N91920_a       RB protein binding protein, chromosome 16
RC_AA621601       GTP-binding protein Rab36, chromosome 17
RC_AA454020       NADPH quinone oxidoreductase homolog; p53 induced, chromosome 2
RC_Z39652_a       APM-1 gene, chromosome 18



Conclusion.

As can be seen from these tables, we have identified a number of genes and ESTs, based on two different
approaches, that we believe are either of importance for initiating and developing colorectal cancer, or can be
used to classify the disease. These genes and ESTs are subdivided into potential tumor suppressors that have a
reduced level during progression of the disease, or that even completely lose their expression; potential
oncogenes that increase their level during disease progression, or are even gained de novo, not being expressed
at early stages or in normal mucosa; and finally classifiers of the disease that can be used to identify the
different Dukes stages, e.g. by being expressed only at a certain stage.
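The three categories named in the conclusion can be illustrated with a minimal sketch. It assumes that the five-letter codes appearing in the claim tables (e.g. PAAAA, AAAAP, PPAPP) are present/absent calls in the order normal mucosa, Dukes A, Dukes B, Dukes C, Dukes D. That ordering is inferred from the tables and is not stated in the text; the function name `categorize` and the wording of the returned labels are hypothetical.

```python
# Assumption: calls are ordered normal mucosa, Dukes A, Dukes B, Dukes C, Dukes D.
STAGES = ("Dukes A", "Dukes B", "Dukes C", "Dukes D")


def categorize(calls: str) -> str:
    """Classify a five-letter present (P) / absent (A) pattern such as 'PAAAA'."""
    if len(calls) != 5 or set(calls) - {"P", "A"}:
        raise ValueError("expected five characters drawn from 'P' and 'A'")
    normal, tumour = calls[0], calls[1:]
    if normal == "P" and tumour == "AAAA":
        # Expressed in normal mucosa, lost at every stage.
        return "potential tumor suppressor (expression lost)"
    if normal == "A" and tumour == "PPPP":
        # Not expressed in normal mucosa, gained at every stage.
        return "potential oncogene (expression gained)"
    if normal == "A" and tumour.count("P") == 1:
        return "classifier: expressed only in " + STAGES[tumour.index("P")]
    if normal == "P" and tumour.count("A") == 1:
        return "classifier: absent only in " + STAGES[tumour.index("A")]
    # Patterns such as 'PPPPP' need quantitative levels, not just P/A calls.
    return "no call from presence/absence alone"
```

Under these assumptions, `categorize("AAAAP")` labels a gene as expressed only in Dukes D, while an all-present pattern such as `PPPPP` cannot be assigned from presence/absence calls alone and would require the measured expression levels.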
Claims:

1. A method of determining the presence or absence of a biological condition in
animal tissue

comprising collecting a sample comprising cells from the tissue and/or expression
products from the cells,

assaying a first expression level of at least one gene from a first gene group,
wherein the gene from the first gene group is selected from genes expressed in
normal tissue cells in an amount higher than expression in biological condition
cells, and/or

assaying a second expression level of at least one gene from a second gene
group, wherein the second gene group is selected from genes expressed in
normal tissue cells in an amount lower than expression in biological condition
cells,

correlating the first expression level to a standard expression level for normal
tissue, and/or the second expression level to a standard expression level for
biological condition cells to determine the presence or absence of a biological
condition in the animal tissue.
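The correlating step of claim 1 can be illustrated with a minimal sketch. The claim does not prescribe how a measured level is compared with the standards; the nearest-standard rule on a log2 scale used here, and the function name `closer_standard`, are assumptions chosen only for illustration.

```python
import math


def closer_standard(level: float, normal_std: float, condition_std: float) -> str:
    """Return which standard the measured level lies closer to on a log2 scale.

    Hypothetical decision rule: the claim only requires correlating the
    measured level to standard levels, not this particular comparison.
    """
    if min(level, normal_std, condition_std) <= 0:
        raise ValueError("expression levels must be positive")
    d_normal = abs(math.log2(level / normal_std))
    d_condition = abs(math.log2(level / condition_std))
    return "normal" if d_normal <= d_condition else "condition"
```

For a gene of the second gene group with a normal-tissue standard of 100 and a condition standard of 400 (arbitrary units), a measured level of 350 would be assigned to the condition standard and a level of 90 to the normal standard.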

2. The method of claim 1, wherein the animal tissue is selected from epithelial
tissue.

3. The method of claim 2, wherein the animal tissue is selected from epithelial
tissue in the gastro-intestinal tract.

4. The method of claim 3, wherein the animal tissue is selected from epithelial
tissue in colon and/or rectum.

5. The method according to claim 4, wherein the animal tissue is mucosa.

6. The method of any of the preceding claims, wherein the biological condition is
an adenocarcinoma, a carcinoma, a teratoma, a sarcoma, and/or a lymphoma.

7. The method of any of the preceding claims, wherein the sample is a biopsy of
the tissue.



8. The method according to any of the preceding claims 1-6, wherein the sample is
a cell suspension made from the tissue.

9. The method according to any of the preceding claims, wherein the sample
comprises substantially only cells from said tissue.

10. The method according to claim 9, wherein the sample comprises substantially
only cells from mucosa.

11. The method according to any of the claims 3-10, wherein the gene from the first
gene group is selected individually from genes comprising a sequence as
identified below



RC_H04768_at      chrom 15 no homology
RC_Z39652_at      Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at      chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at    tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at      secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at    chrom 2 no homology
RC_R01646_at      chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
RC_AA099820_at    BAC clone AC016778
AA319615_at       secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15
H07011_at         tetraspan NET-6 mRNA; transmembrane 4 superfamily; chrom 7
RC_T68873_f_at
RC_T40995_f_at
RC_H81070_f_at
RC_N30796_at
RC_W37778_f_at
RC_R70212_s_at
RC_AA426330_at
RC_N33927_s_at
RC_T90190_s_at
RC_AA447145_at
RC_H75860_at
RC_T71132_s_at

wherein the notation refers to Accession No. in the database UniGene (Build 18).



12. The method according to claim 11, wherein the gene from the first gene group is
selected individually from genes comprising a sequence as identified below

RC_H04768_at      chrom 15 no homology
RC_Z39652_at      Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at      chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at    tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at      secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at    chrom 2 no homology
RC_R01646_at      chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
RC_AA099820_at    BAC clone AC016778
AA319615_at       secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15
H07011_at         tetraspan NET-6 mRNA; transmembrane 4 superfamily; chrom 7

wherein the notation refers to Accession No. in the database UniGene (Build 18).



13. The method according to claim 12, wherein the gene from the first gene group is
selected individually from genes comprising a sequence as identified below

RC_H04768_at      chrom 15 no homology
RC_Z39652_at      Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at      chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at    tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at      secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at    chrom 2 no homology
RC_R01646_at      chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
AA319615_at       secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15

wherein the notation refers to Accession No. in the database UniGene (Build 18).



14. The method according to claim 13, wherein the gene from the first gene group is
selected individually from genes comprising a sequence as identified below

RC_T47089_s_at    tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at      secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at    chrom 2 no homology
AA319615_at       secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15

wherein the notation refers to Accession No. in the database UniGene (Build 18).



15. The method according to any of claims 3-14, wherein the second gene group
are selected individually from genes comprising a sequence as identified below

RC_AA609013_s_at  microsomal dipeptidase (also on 6.8k); chrom 16
RC_AA232508_at    CGI-89 protein; unnamed protein product; hypothetical protein
RC_AA428964_at    serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1
RC_T52813_s_at    dJ28O10.2 (G0S2 (PUTATIVE LYMPHOCYTE G0/G1 SWITCH PROTEIN 2; chrom 1
RC_AA075642_at    gp-340 variant protein; DMBT1/8kb.2 protein
RC_AA007218_at    chrom 13 no homology
RC_N33920_at      ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6
RC_N71781_at      KIAA1199 protein, chrom 15
RC_R67275_s_at    alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1
RC_W80763_at      hypothetical protein; chrom 17
RC_AA443793_at    chrom 7p22 AC006028 BAC clone
RC_AA034499_s_at  ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13
RC_AA035482_at    chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_AA024482_at    hypothetical protein; unnamed protein product; chrom 17
RC_H93021_at      chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at    no homology
RC_AA417078_at    chrom 7q31; AF017104 clone
M29873_s_at       cytochrome P450-IIB (hIIB3); 19q13.1-q13.2
RC_H27498_f_at
RC_T92363_s_at
RC_N89910_at
RC_W60516_at
RC_AA219699_at
RC_AA449450_at

wherein the notation refers to Accession No. in the database UniGene (Build 18).



16. The method according to any of claims 3-15, wherein the second gene group
are selected individually from genes comprising a sequence as identified below

RC_AA609013_s_at  microsomal dipeptidase (also on 6.8k); chrom 16
RC_AA232508_at    CGI-89 protein; unnamed protein product; hypothetical protein
RC_AA428964_at    serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1
RC_T52813_s_at    dJ28O10.2 (G0S2 (PUTATIVE LYMPHOCYTE G0/G1 SWITCH PROTEIN 2; chrom 1
RC_AA075642_at    gp-340 variant protein; DMBT1/8kb.2 protein
RC_AA007218_at    chrom 13 no homology
RC_N33920_at      ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6
RC_N71781_at      KIAA1199 protein, chrom 15
RC_R67275_s_at    alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1
RC_W80763_at      hypothetical protein; chrom 17
RC_AA443793_at    chrom 7p22 AC006028 BAC clone
RC_AA034499_s_at  ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13
RC_AA035482_at    chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_AA024482_at    hypothetical protein; unnamed protein product; chrom 17
RC_H93021_at      chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at    no homology
RC_AA417078_at    chrom 7q31; AF017104 clone
M29873_s_at       cytochrome P450-IIB (hIIB3); 19q13.1-q13.2

wherein the notation refers to Accession No. in the database UniGene (Build 18).



17. The method according to any of claims 3-14, wherein the second gene group
are selected individually from genes comprising a sequence as identified below

RC_AA609013_s_at  microsomal dipeptidase (also on 6.8k); chrom 16
RC_AA232508_at    CGI-89 protein; unnamed protein product; hypothetical protein
RC_AA428964_at    serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1
RC_AA075642_at    gp-340 variant protein; DMBT1/8kb.2 protein
RC_AA007218_at    chrom 13 no homology
RC_N33920_at      ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6
RC_N71781_at      KIAA1199 protein, chrom 15
RC_R67275_s_at    alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1
RC_W80763_at      hypothetical protein; chrom 17
RC_AA034499_s_at  ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13
RC_AA035482_at    chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_AA024482_at    hypothetical protein; unnamed protein product; chrom 17
RC_H93021_at      chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at    no homology
RC_AA417078_at    chrom 7q31; AF017104 clone
M29873_s_at       cytochrome P450-IIB (hIIB3); 19q13.1-q13.2

wherein the notation refers to Accession No. in the database UniGene (Build 18).

18. The method according to any of claims 3-17, wherein the second gene group
comprises a sequence as identified below

RC_W80763_at      hypothetical protein; chrom 17

wherein the notation refers to Accession No. in the database UniGene (Build 18).

19. The method according to any of the preceding claims, wherein the expression
levels of at least two genes from the first gene group are determined.

20. The method according to any of the preceding claims, wherein the expression
levels of at least two genes from the second gene group are determined.

21. The method according to any of the preceding claims, further comprising the
steps of determining the stage of a biological condition in the animal tissue,
comprising assaying a third expression level of at least one gene from a third
gene group, wherein a gene from said second gene group, in one stage, is
expressed differently from a gene from said third gene group.

22. The method according to any of the preceding claims, wherein the difference in
expression level of a gene from one group to the expression level of a gene from
another group is at least two-fold.

23. The method according to any of the preceding claims, wherein the difference in
expression level of a gene from one group to the expression level of a gene from
another group is at least three-fold.
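The two-fold and three-fold criteria of claims 22 and 23 can be checked as in the following minimal sketch. It assumes that "fold" means the ratio of the higher to the lower of two positive expression levels; the function names are hypothetical.

```python
def fold_difference(level_a: float, level_b: float) -> float:
    """Ratio of the higher to the lower of two positive expression levels."""
    if level_a <= 0 or level_b <= 0:
        raise ValueError("expression levels must be positive")
    return max(level_a, level_b) / min(level_a, level_b)


def meets_fold_threshold(level_a: float, level_b: float, min_fold: float = 2.0) -> bool:
    """True when the two levels differ by at least `min_fold`."""
    return fold_difference(level_a, level_b) >= min_fold
```

With levels of 250 and 100 (arbitrary units) the difference is 2.5-fold, which satisfies the two-fold criterion but not the three-fold criterion.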
24. The method according to any of the preceding claims, wherein the expression
level is determined by determining the mRNA of the cells.

25. The method according to any of the claims 1-23, wherein the expression level is
determined by determining expression products, such as peptides, in the cells.

26. The method according to claim 25, wherein the expression level is determined
by determining expression products, such as peptides, in the body fluids, such
as blood, serum, plasma, faeces, mucus, sputum, cerebrospinal fluid, and/or
urine.

27. A method of determining the stage of a biological condition in animal tissue,

comprising collecting a sample comprising cells from the tissue,

assaying the expression of at least a first stage gene from a first stage gene
group and/or at least a second stage gene from a second stage gene group,
wherein at least one of said genes is expressed in said first stage of the
condition in a higher amount than in said second stage, and the other gene is
expressed in said first stage of the condition in a lower amount than in said
second stage of the condition,

correlating the expression level of the assessed genes to a standard level of
expression determining the stage of the condition.

28. The method according to claim 27, wherein the tissue is selected from the
epithelial tissue in colon or rectum.

29. The method according to any of the preceding claims 27-28, wherein the
difference in expression levels between a gene from one group to a gene from
another group is at least one-fold.

30. The method according to any of the preceding claims 27-29, wherein the
difference in expression levels between a gene from one group to a gene from
another group is at least two-fold.
31. The method according to claim 27, wherein the stage is selected from colon
cancer stages Dukes A, Dukes B, Dukes C, and Dukes D.

32. The method according to claim 31, comprising assaying at least the expression
of Dukes A stage gene from a Dukes A stage gene group, at least one Dukes B
stage gene from a Dukes B stage gene group, at least the expression of Dukes
C stage gene from a Dukes C stage gene group, and at least one Dukes D
stage gene from a Dukes D stage gene group, wherein at least one gene from
each gene group is expressed in a significantly different amount in that stage
than in one of the other stages.

33. The method according to claim 32, wherein at least one gene from each gene
group is expressed in a significantly higher amount in that stage than in one of
the other stages.

34. The method according to claim 33, wherein a Dukes A stage gene is selected
individually from any gene comprising a sequence as identified below

RC_AA599199_at    ALU seq
RC_R12694_at      unnamed protein product BAA91641, chrom 10
RC_H91325_s_at    aldolase B; aldolase B (aa 1-364); chrom 9
RCJM51709_at      chrom X
RCJM72610_at
RC_N69263_at      chrom 10; AK026414 clone (only 108 nt hom)
RC_T15817_f_at    iNOS, inducible nitric oxide synthase

wherein the notation refers to Accession No. in the database UniGene (Build 18).



35. The method according to claim 33, wherein a Dukes B stage gene is selected
individually from any gene comprising a sequence as identified below

RC_T67463_s_at    cathepsin O2; X; K
RC_W94688_at      perilipin
RC_AA126743_at    Z97200 PAC chrom 1q24; PMX1 homeobox gene
RC_AA236547_at    no homology
RC_AA255567_at    angiopoietin-related protein-2; angiopoietin-like 2
RC_AA421256_at
RC_AA386386_s_at  PPPPP -
RC_AA452549_at    PPPPP PRO1659; hypothetical protein chrom 11

wherein the notation refers to Accession No. in the database UniGene (Build 18).

36. The method according to claim 33, wherein a Dukes C stage gene is selected
individually from any gene comprising a sequence as identified below

RC_D45556_at      chrom 15; AL390085 clone
RC_W86214_at
RC_AA039439_s_at  novel gene KIAA0134 protein 19q13.3
RC_AA128935_at
RC_AA134158_s_at  class I homeodomain; homeobox protein, chrom 7
RC_AA232646_at    chrom 17, AF266756 sphingosine kinase (SPHK1)
RC_AA401184_at    no homology
RC_AA436840_at
RC_AA488655_at
RC_AA181902_at    PPPPP AC007201 on chrom 19 (only 80 nt hom)

wherein the notation refers to Accession No. in the database UniGene (Build 18).



37. The method according to claim 33, wherein a Dukes D stage gene is selected
individually from any gene comprising a sequence as identified below

RC_N91920_at      AAAAP  chrom 16p12-p11.2; XM_007994 retinoblastoma binding protein
RC_AA621601_at    AAAAP  chrom 17 XM_009868 RAB36, RAS oncogene family

wherein the notation refers to Accession No. in the database UniGene (Build 18).

38. The method according to claim 32, wherein at least one gene from each gene 
group is expressed in a significantly lower amount in that stage than in one of 
the other stages. 

39. The method according to claim 38, wherein a Dukes A stage gene is selected
individually from any gene comprising a sequence as identified below

RCJM32411_f_at    PAPPP  Myc-associated zinc-finger protein of human islet; chrom 16
RC_AA243858_at    PAPPP  KIAA0882 protein
RC_AA486283_at    PAPPP  ras-like protein; ras-related C3 botulinum toxin substrate; dJ20J23
RC_AA490930_at    PAPPP  chrom 18; KIAA1468 protein
RC_H54088_s_at    PPPPP  ribosomal protein L41
RC_H59052_f_at    PPPPP  fungal sterol-C5-desaturase homolog; ORF; thymosin beta-4
RC_R49198_s_at    PPPPP  -
RC_T73572_f_at    PPPPP  ferritin L-chain; L apoferritin
RC_AA477483_at    PPPPP  no matching est

wherein the notation refers to Accession No. in the database UniGene (Build 18).



40. The method according to claim 38, wherein a Dukes B stage gene is selected
individually from any gene comprising a sequence as identified below

RC_D59847_at      PPAPP  proSAAS; granin-like neuroendocrine peptide precursor
RC_F05038_at      PPAPP  polyamine modulated factor-1; polyamine modulated factor 1
RC_N41059_at      PPAPP  chrom 3
RC_T23460_at      PPAPP  chrom 3; IFNAR2 21q22.11
RC_W42789_at      PPAPP  chrom 8 AF268037 C8ORF4 protein (C8ORF4) chrom 8 ORF
RC_AA460017_i_at  PPAPP  BAC clone chrom 16
RC_AA482127_at    PPAPP  KIAA1142 protein
RC_AA504806_at    PPAPP  chrom 2 AF052107 clone 23620 mRNA sequence
RC_T90037_at      PPPPP  unnamed protein product, chrom 4
RC_AA432130_at    PPPPP  KIAA0867 protein, chrom 12

wherein the notation refers to Accession No. in the database UniGene (Build 18).



41. The method according to claim 38, wherein a Dukes C stage gene is selected
individually from any gene comprising a sequence as identified below

RC_N30231_at      PPPAP  Lsm4 protein; U6 snRNA-associated Sm-like protein LSm4; glycine-rich protein
RC_W73790_f_at    PPPAP  immunoglobulin-related protein 14.1; lambda L-chain C region; omega protein, chrom 22
RC_AA412184_at    PPPAP  chrom 1p36; d89060 dolichyl-diphosphooligosaccharide-protein glycosyltransferase
RC_AA521303_at    PPPAP  methionine adenosyltransferase regulatory beta subunit; dTDP-4-keto-6-deoxy-D-glucose 4-reductase, chrom 5
RC_AA461174_at    PPPPP  8p21.3-p22 AB020860 anti-oncogene
AA393432_s_at     PPPPP  chrom 2, Unknown; unnamed protein product AAD20029

wherein the notation refers to Accession No. in the database UniGene (Build 18).
42. The method according to claim 38, wherein a Dukes D stage gene is selected
individually from any gene comprising a sequence as identified below

RC_R72886_s_at    PPPPA  KIAA0422; adenylyl cyclase type VI, chrom 12
RC_AA026030_at    PPPPA  chrom 1
RC_Z39006_at      PPPPA  hypothetical protein, chrom 17
RC_AA435908_at    PPPPA  chrom 19; ac011491 clone and 20 nt hom. RAB2, RAS oncogene family
RC_AA057829_s_at  PPPPA  growth-arrest-specific protein; growth arrest-specific 6; AXL stimulatory factor, chrom 13
RC_R72087_at      PPPPA  chrom 5 EST; hom to chrom 20 AL356652 clone
RC_H04242_at      PPPPA  ras related protein Rab5b; RAB5B, member RAS oncogene family
RC_R97304_f_at    PPPPA  HLA-drb5; cell surface glycoprotein; MHC HLA-DR-beta chain precursor chrom 6
RC_N48609_at      PPPPA  chrom 11; AC004584 chrom 17
RCJA/86850_f_at   PPPPA  chrom 22 ? X96924 mitochondrial citrate transport region
RC_AA130603_at    PPPPA  ak024908 clone
RC_AA479610_at    PPPPA  singleton ak025344 clone
RC_AA490593_f_at  PPPPA  chrom 17 ? Synaptobrevin2 (VAMP2) AF135372
RC_AA054321_s_at  PPPPA  6p21 HLA class I region; AC004202 clone
RC_D60328_at      PPPPP  chrom 6, unknown; ring finger protein 5
RC_H96850_at      PPPPP  oligosaccharyltransferase d89060 1p36.1 (also C-class)
RC_AA127444_at    PPPPP  chrom 1 no homology
RC_AA242824_at    PPPPP  chrom 11; ac005233 PAC clone chrom 22
AA405775_s_at     PPPPP  similar to CAA16821 (PID:g3255952)

wherein the notation refers to Accession No. in the database UniGene (Build 18).
43. A method of determining an expression pattern of a colon cell sample,
comprising:

collecting a sample comprising colon and/or rectum cells and/or expression
products from colon and/or rectum cells,

determining the expression level of two or more genes in the sample, wherein at
least one gene belongs to a first group of genes, said gene from the first gene
group being expressed in a higher amount in normal tissue than in biological
condition cells, and wherein at least one other gene belongs to a second group
of genes, said gene from the second gene group being expressed in a lower
amount in normal tissue than in biological condition cells, and the difference
between the expression level of the first gene group in normal cells and
biological condition cells being at least two-fold, obtaining an expression
pattern of the colon and/or rectum cell sample.

44. The method of claim 43, wherein the two or more genes exclude genes which
are expressed in the submucosal, muscle, or connective tissue, whereby a
pattern of expression is formed for the sample which is independent of the
proportion of submucosal, muscle, or connective tissue cells in the sample.

45. The method of claim 44, comprising determining the expression level of one or
more genes in the sample comprising predominantly submucosal, muscle, and
connective tissue cells, obtaining a second pattern, subtracting said second
pattern from the expression pattern of the colon and/or rectum cell sample,
forming a third pattern of expression, said third pattern of expression reflecting
expression of the colorectal mucosa or colorectal cancer cells independent of
the proportion of submucosal, muscle, and connective tissue cells present in the
sample.
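The subtraction step of claim 45 can be sketched as follows. The claim does not specify how patterns are represented or how negative differences are treated; the gene-to-level dictionaries, the clipping at a floor of zero, and the name `subtract_pattern` are assumptions for illustration only.

```python
def subtract_pattern(sample: dict, background: dict, floor: float = 0.0) -> dict:
    """Subtract a background expression pattern from a sample pattern.

    `sample` maps gene identifiers to expression levels of the colon/rectum
    sample (first pattern); `background` holds levels from predominantly
    submucosal, muscle and connective tissue (second pattern). Genes missing
    from the background are left unchanged; results are clipped at `floor`
    (an assumption, the claim is silent on negative values).
    """
    return {
        gene: max(level - background.get(gene, 0.0), floor)
        for gene, level in sample.items()
    }
```

For example, a sample pattern `{"g1": 10.0, "g2": 4.0}` with a background pattern `{"g2": 6.0}` gives a third pattern in which `g1` is unchanged and `g2` is clipped to zero.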

46. The method of any of the preceding claims 43-45, wherein the sample is a
biopsy of the tissue.

47. The method according to any of the preceding claims 43-46, wherein the sample
is a cell suspension.

48. The method according to any of the preceding claims 43-47, wherein the sample
comprises substantially only cells from said tissue.

49. The method according to claim 48, wherein the sample comprises substantially
only cells from mucosa.

50. The method according to any of the claims 43-47, wherein the gene from the
first gene group is selected individually from



RC_H04768_at      chrom 15 no homology
RC_Z39652_at      Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at      chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at    tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at      secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at    chrom 2 no homology
RC_R01646_at      chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
RC_AA099820_at    BAC clone AC016778
AA319615_at       secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15
H07011_at         tetraspan NET-6 mRNA; transmembrane 4 superfamily; chrom 7
RC_T68873_f_at
RC_T40995_f_at
RC_H81070_f_at
RC_N30796_at
RC_W37778_f_at
RC_R70212_s_at
RC_AA426330_at
RC_N33927_s_at
RC_T90190_s_at
RC_AA447145_at
RC_H75860_at
RC_T71132_s_at






wherein the notation refers to Accession No. in the database UniGene (Build 18).



51. The method according to claim 50, wherein the gene from the first gene group is
selected individually from genes comprising a sequence as identified below

RC_H04768_at      chrom 15 no homology
RC_Z39652_at      Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at      chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at    tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at      secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at    chrom 2 no homology
RC_R01646_at      chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
RC_AA099820_at    BAC clone AC016778
AA319615_at       secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15
H07011_at         tetraspan NET-6 mRNA; transmembrane 4 superfamily; chrom 7

wherein the notation refers to Accession No. in the database UniGene (Build 18).



52. The method according to claim 51, wherein the gene from the first gene group is
selected individually from genes comprising a sequence as identified below

RC_H04768_at      chrom 15 no homology
RC_Z39652_at      Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23
RC_H30270_at      chrom 18 PAAAA in colon & bladder no homology
RC_T47089_s_at    tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at      secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at    chrom 2 no homology
RC_R01646_at      chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1
AA319615_at       secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15

wherein the notation refers to Accession No. in the database UniGene (Build 18).

53. The method according to claim 52, wherein the gene from the first gene group is
selected individually from genes comprising a sequence as identified below

RC_T47089_s_at    tenascin-X; tenascin-X precursor; unidentified protein
RC_W31906_at      secretagogin; dJ501N12.8 (putative protein) chrom 6
RC_AA279803_at    chrom 2 no homology
AA319615_at       secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15

wherein the notation refers to Accession No. in the database UniGene (Build 18).



54. The method according to any of claims 3-14, wherein the second gene group
are selected individually from genes comprising a sequence as identified below

RC_AA609013_s_at  microsomal dipeptidase (also on 6.8k); chrom 16
RC_AA232508_at    CGI-89 protein; unnamed protein product; hypothetical protein
RC_AA428964_at    serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1
RC_T52813_s_at    dJ28O10.2 (G0S2 (PUTATIVE LYMPHOCYTE G0/G1 SWITCH PROTEIN 2; chrom 1
RC_AA075642_at    gp-340 variant protein; DMBT1/8kb.2 protein
RC_AA007218_at    chrom 13 no homology
RC_N33920_at      ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6
RC_N71781_at      KIAA1199 protein, chrom 15
RC_R67275_s_at    alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1
RC_W80763_at      hypothetical protein; chrom 17
RC_AA443793_at    chrom 7p22 AC006028 BAC clone
RC_AA034499_s_at  ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13
RC_AA035482_at    chrom 5; AK022505 clone; CalcineurinB (weakly similar)
RC_AA024482_at    hypothetical protein; unnamed protein product; chrom 17
RC_H93021_at      chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A)
RC_AA427737_at    no homology
RC_AA417078_at    chrom 7q31; AF017104 clone
M29873_s_at       cytochrome P450-IIB (hIIB3); 19q13.1-q13.2
RC_H27498_f_at
RC_T92363_s_at
RC_N89910_at
RC_W60516_at
RC_AA219699_at
RC_AA449450_at

wherein the notation refers to Accession No. in the database UniGene (Build 18).



55. The method according to any of claims 43-49, wherein the second gene group 
are selected individually from genes comprising a sequence as identified below 



RC_AA609013_s_at    microsomal dipeptidase (also on 6.8k); chrom 16 
RC_AA232508_at      CGI-89 protein; unnamed protein product; hypothetical protein 
RC_AA428964_at      serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1 
RC_T52813_s_at      dJ28O10.2 (G0S2 (PUTATIVE LYMPHOCYTE G0/G1 SWITCH PROTEIN 2; chrom 1 
RC_AA075642_at      gp-340 variant protein; DMBT1/8kb.2 protein 
RC_AA007218_at      chrom 13 no homology 
RC_N33920_at        ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6 
RC_N71781_at        KIAA1199 protein, chrom 15 
RC_R67275_s_at      alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1 
RC_W80763_at        hypothetical protein; chrom 17 
RC_AA443793_at      chrom 7p22 AC006028 BAC clone 
RC_AA034499_s_at    ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13 
RC_AA035482_at      chrom 5; AK022505 clone; CalcineurinB (weakly similar) 
RC_AA024482_at      hypothetical protein; unnamed protein product; chrom 17 
RC_H93021_at        chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A) 
RC_AA427737_at      no homology 
RC_AA417078_at      chrom 7q31; AF017104 clone 
M29873_s_at         cytochrome P450-IIB (hIIB3); 19q13.1-q13.2 

wherein the notation refers to Accession No. in the database UniGene (Build 18). 

56. The method according to any of claims 43-49, wherein the second gene group 
are selected individually from genes comprising a sequence as identified below 



RC_AA609013_s_at    microsomal dipeptidase (also on 6.8k); chrom 16 
RC_AA232508_at      CGI-89 protein; unnamed protein product; hypothetical protein 
RC_AA428964_at      serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1 
RC_AA075642_at      gp-340 variant protein; DMBT1/8kb.2 protein 
RC_AA007218_at      chrom 13 no homology 
RC_N33920_at        ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6 
RC_N71781_at        KIAA1199 protein, chrom 15 
RC_R67275_s_at      alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1 
RC_W80763_at        hypothetical protein; chrom 17 
RC_AA034499_s_at    ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13 
RC_AA035482_at      chrom 5; AK022505 clone; CalcineurinB (weakly similar) 
RC_AA024482_at      hypothetical protein; unnamed protein product; chrom 17 
RC_H93021_at        chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A) 
RC_AA427737_at      no homology 
RC_AA417078_at      chrom 7q31; AF017104 clone 
M29873_s_at         cytochrome P450-IIB (hIIB3); 19q13.1-q13.2 

wherein the notation refers to Accession No. in the database UniGene (Build 18). 

57. The method according to any of claims 43-49, wherein the second gene group 
comprises a sequence as identified below 

RC_W80763_at    Hypothetical protein; chrom 17 

wherein the notation refers to Accession No. in the database UniGene (Build 18). 

58. The method according to any of the preceding claims 43-57, wherein the 
expression level of at least two genes from the first gene group are determined. 



59. The method according to any of the preceding claims 43-58, wherein the 
expression level of at least two genes from the second gene group are determined. 



60. A method of determining an expression pattern of a colon cell sample 
independent of the proportion of submucosal, muscle, or connective tissue cells 
present, comprising: 

determining the expression of one or more genes in a sample comprising cells, 
wherein the one or more genes exclude genes which are expressed in the 
submucosal, muscle, or connective tissue, whereby a pattern of expression is 
formed for the sample which is independent of the proportion of submucosal, 
muscle, or connective tissue cells in the sample. 

61. The method according to claim 60, comprising determining the expression level 
of one or more genes in the sample comprising predominantly submucosal, 
muscle, and connective tissue cells, obtaining a second pattern, subtracting said 
second pattern from the expression pattern of the colon and/or rectum cell 
sample, forming a third pattern of expression, said third pattern of expression 
reflecting expression of the colon cells independent of the proportion of 
submucosal, muscle, and connective tissue cells present in the sample. 
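
The exclusion step of claim 60 and the subtraction step of claim 61 can be illustrated with a minimal Python sketch. It assumes that an expression pattern is represented as a mapping from gene identifier to a numeric expression level and that the subtraction is performed gene by gene without rescaling; the claims do not specify either point. The function names and the gene identifiers are hypothetical.

```python
def exclude_genes(pattern, excluded):
    """Drop genes known to be expressed in submucosal, muscle,
    or connective tissue (claim 60, assumed representation)."""
    return {gene: level for gene, level in pattern.items() if gene not in excluded}


def subtract_pattern(sample_pattern, background_pattern):
    """Form a third pattern by subtracting the second (background) pattern
    from the sample pattern, gene by gene (claim 61, assumed arithmetic).
    Genes absent from the background are treated as having level 0."""
    return {
        gene: level - background_pattern.get(gene, 0.0)
        for gene, level in sample_pattern.items()
    }


# Hypothetical expression levels in arbitrary units.
sample = {"gene_a": 120.0, "gene_b": 45.0, "gene_c": 30.0}
background = {"gene_b": 40.0, "gene_c": 30.0}

filtered = exclude_genes(sample, excluded={"gene_c"})
third = subtract_pattern(sample, background)
```

In practice the two patterns would probably need to be normalised to a common scale before subtraction; that step is left out because the claim text does not describe it.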


62. A method of determining the presence or absence of a biological condition in 
human colon and/or rectum tissue comprising, 

collecting a sample comprising cells from the tissue, 

determining an expression pattern of the cells as defined in any of claims 43-61, 

correlating the determined expression pattern to a standard pattern, 

determining the presence or absence of the biological condition of said tissue. 
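
One way to picture the correlating step is sketched below in Python. The claim does not state how the determined pattern is correlated to the standard pattern; the sketch assumes a Pearson correlation over the genes shared by both patterns and a hypothetical cut-off of 0.8. The function names, the threshold and the example values are illustrative assumptions only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant pattern")
    return cov / (sd_x * sd_y)


def matches_standard(pattern, standard, threshold=0.8):
    """Return True if the pattern correlates with the standard pattern
    at or above the (assumed) threshold, using only shared genes."""
    genes = sorted(set(pattern) & set(standard))
    if len(genes) < 2:
        raise ValueError("need at least two shared genes")
    r = pearson([pattern[g] for g in genes], [standard[g] for g in genes])
    return r >= threshold


# Hypothetical patterns in arbitrary units.
standard = {"gene_a": 1.0, "gene_b": 2.0, "gene_c": 3.0}
similar = {"gene_a": 2.0, "gene_b": 4.0, "gene_c": 6.0}
opposite = {"gene_a": 3.0, "gene_b": 2.0, "gene_c": 1.0}
```

Here `matches_standard(similar, standard)` evaluates to true and `matches_standard(opposite, standard)` to false, since the first pattern is a scaled copy of the standard and the second is its reverse.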

63. A method for determining the stage of a biological condition in animal tissue, 
comprising 

collecting a sample comprising cells from the tissue, 

determining an expression pattern of the cells as defined in any of claims 43-61, 

correlating the determined expression pattern to a standard pattern, 

determining the stage of the biological condition in said tissue. 

64. A method for reducing cell tumorigenicity of a cell, said method comprising 

contacting a tumor cell with at least one peptide expressed by at least one gene 
selected from genes being expressed in an amount at least two-fold higher in normal 
cells than the amount expressed in said tumor cell. 
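
The two-fold selection criterion can be sketched in Python as follows. The sketch assumes that expression amounts for normal and tumor cells are available as mappings from gene identifier to a non-negative number on the same scale; the function name, the identifiers and the values are hypothetical and are not taken from the application.

```python
def higher_in_normal(normal, tumor, fold=2.0):
    """Return genes whose amount in normal cells is at least `fold` times
    the amount in tumor cells. Written as a multiplication so that a
    tumor amount of zero does not cause a division by zero."""
    selected = []
    for gene, normal_amount in normal.items():
        tumor_amount = tumor.get(gene)
        if tumor_amount is None:
            continue  # gene not measured in the tumor cell
        if normal_amount > 0 and normal_amount >= fold * tumor_amount:
            selected.append(gene)
    return sorted(selected)


# Hypothetical amounts in arbitrary units.
normal = {"gene_1": 100.0, "gene_2": 50.0, "gene_3": 10.0}
tumor = {"gene_1": 40.0, "gene_2": 30.0, "gene_3": 10.0}
candidates = higher_in_normal(normal, tumor)
```

With these values only `gene_1` qualifies, because 100 is at least twice 40, while 50 is less than twice 30 and 10 is less than twice 10.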

65. The method according to claim 64, wherein the at least one gene is selected 
individually from genes comprising a sequence as identified below 

RC_H04768_at      chrom 15 no homology 
RC_Z39652_at      Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23 
RC_H30270_at      chrom 18 PAAAA in colon & bladder no homology 
RC_T47089_s_at    tenascin-X; tenascin-X precursor; unidentified protein 
RC_W31906_at      secretagogin; dJ501N12.8 (putative protein) chrom 6 
RC_AA279803_at    chrom 2 no homology 
RC_R01646_at      chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1 
AA319615_at       secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15 

wherein the notation refers to Accession No. in the database UniGene (Build 18). 

66. The method according to claim 64 or 65, wherein the tumor cell is contacted with 
at least two different peptides. 



67. A method for reducing cell tumorigenicity of a cell, said method comprising 



obtaining at least one gene selected from genes being expressed in an amount at least 
two-fold higher in normal cells than the amount expressed in said tumor cell, 



introducing said at least one gene into the tumor cell in a manner allowing 
expression of said gene(s). 


68. The method according to claim 67, where the at least one gene is selected 
individually from genes comprising a sequence as identified below 



RC_H04768_at      chrom 15 no homology 
RC_Z39652_at      Y14593 APM-1 gene adipocyte-specific secretory protein; chrom 1q21.3-q23 
RC_H30270_at      chrom 18 PAAAA in colon & bladder no homology 
RC_T47089_s_at    tenascin-X; tenascin-X precursor; unidentified protein 
RC_W31906_at      secretagogin; dJ501N12.8 (putative protein) chrom 6 
RC_AA279803_at    chrom 2 no homology 
RC_R01646_at      chrom 13q32.1-33.3; AL159152; homology to mouse Pcbp1 - poly(rC)-binding protein 1 
AA319615_at       secretory carrier membrane protein; secretory carrier membrane protein 2; chrom 15 

wherein the notation refers to Accession No. in the database UniGene (Build 18). 

69. The method according to claim 67 or 68, wherein at least two different genes are 
introduced into the tumor cell. 

70. A method for reducing cell tumorigenicity of a cell, said method comprising 

obtaining at least one nucleotide probe capable of hybridising with at least one gene 
of a tumor cell, said at least one gene being selected from genes being expressed in 
an amount at least one-fold lower in normal cells than the amount expressed in said 
tumor cell, and 

introducing said at least one nucleotide probe into the tumor cell in a manner 
allowing the probe to hybridise to the at least one gene, thereby inhibiting 
expression of said at least one gene. 

71. The method according to claim 70, wherein the nucleotide probe is selected 
from probes capable of hybridising to a nucleotide sequence comprising a sequence 
as identified below 

RC_AA609013_s_at    APPPP    microsomal dipeptidase (also on 6.8k); chrom 16 
RC_AA232508_at      APPPP    CGI-89 protein; unnamed protein product; hypothetical protein 
RC_AA428964_at      APPPP    serine protease-like protease; serine protease homolog=NES1; normal epithelial cell-specific 1 
RC_T52813_s_at      APPPP    dJ28O10.2 (G0S2 (PUTATIVE LYMPHOCYTE G0/G1 SWITCH PROTEIN 2; chrom 1 
RC_AA075642_at      APPPP    gp-340 variant protein; DMBT1/8kb.2 protein 
RC_AA007218_at      APPPP    chrom 13 no homology 
RC_N33920_at        APPPP    ubiquitin-like protein FAT10; diubiquitin; dJ271M21.6 (Diubiquitin); chrom 6 
RC_N71781_at        APPPP    KIAA1199 protein, chrom 15 
RC_R67275_s_at      APPPP    alpha-1 (type XI) collagen precursor; collagen, type XI, alpha 1; collagen type XI alpha-1 isoform A; chrom 1 
RC_W80763_at        APPPP    hypothetical protein; chrom 17 
RC_AA443793_at      APPPP    chrom 7p22 AC006028 BAC clone 
RC_AA034499_s_at    APPPP    ZNF198 protein; zinc finger protein; FIM protein; Cys-rich protein; zinc finger protein 198; chrom 13 
RC_AA035482_at      APPPP    chrom 5; AK022505 clone; CalcineurinB (weakly similar) 
RC_AA024482_at      APPPP    hypothetical protein; unnamed protein product; chrom 17 
RC_H93021_at        APPPP    chrom 2; XM_004890 peptidylprolyl isomerase A (cyclophilin A) 
RC_AA427737_at      APPPP    no homology 
RC_AA417078_at      APPPP    chrom 7q31; AF017104 clone 
M29873_s_at         APPPP    cytochrome P450-IIB (hIIB3); 19q13.1-q13.2 
RC_H27498_f_at      AAPPP 
RC_T92363_s_at      AAPPP 
RC_N89910_at        AAAPP 
RC_W60516_at        AAAPP 
RC_AA219699_at      AAAPP 
RC_AA449450_at      AAAPP 

wherein the notation refers to Accession No. in the database UniGene (Build 18). 



72. The method according to claim 70 or 71, wherein at least two different genes are 
introduced into the tumor cell. 



73. A method for producing antibodies against an expression product of a cell from a 
biological tissue, said method comprising the steps of 

obtaining expression product(s) from at least one gene said gene being expressed 
as defined in any of claims 27-37, 

immunising a mammal with said expression product(s), obtaining antibodies against 
the expression product. 

74. A pharmaceutical composition for the treatment of a biological condition 
comprising at least one antibody produced as described in claim 73. 

75. A vaccine for the prophylaxis or treatment of a biological condition comprising at 
least one expression product from at least one gene said gene being expressed as 
defined in any of claims 27-37. 

76. The use of a method as defined in any of claims 1-63 for producing an assay for 
diagnosing a biological condition in animal tissue. 

77. The use of a peptide as defined in any of claims 64-66 for preparation of a 
pharmaceutical composition for the treatment of a biological condition in animal 
tissue. 


78. The use of a gene as defined in any of claims 67-69 for preparation of a 
pharmaceutical composition for the treatment of a biological condition in animal 
tissue. 

79. The use of a probe as defined in any of claims 70-72 for preparation of a 
pharmaceutical composition for the treatment of a biological condition in animal 
tissue. 

80. An assay for determining the presence or absence of a biological condition in 
animal tissue, comprising 

at least one first marker capable of detecting a first expression level of at least 
one gene from a first gene group, wherein the gene from the first gene group is 
selected from genes expressed in normal tissue cells in an amount higher than 
expression in biological condition cells, 

at least one second marker capable of detecting a second expression level of at 
least one gene from a second gene group, wherein the second gene group is 
selected from genes expressed in normal tissue cells in an amount lower than 
expression in biological condition cells. 

81. The assay according to claim 80, wherein the marker is a nucleotide probe. 

82. The assay according to claim 80, wherein the marker is an antibody. 


83. The assay according to claim 80, wherein the genes are as defined in any of 
claims 11-18, 34-37, and 39-42. 

84. An assay for determining an expression pattern of a colon and/or rectum cell, 
comprising at least a first marker and a second marker, wherein the first marker is 
capable of detecting a gene from a first gene group as defined in claim 43, and the 
second marker is capable of detecting a gene from a second gene group as defined 
in claim 43. 

85. The assay according to claim 84, wherein the first marker is capable of detecting 
one gene as identified in Table I, and the second marker is capable of detecting 
another gene as identified in Table I. 

86. The assay according to claim 85, comprising at least two markers for each gene 
group, 

correlating the first expression level and the second expression level to a standard 
level of the assessed genes to determine the presence or absence of a biological 
condition in the animal tissue. 
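
A deliberately simple Python sketch of how two expression levels might be compared with standard levels is given here. The claims do not state a decision rule; the sketch assumes that the condition is indicated when the first-group level lies below its standard (first-group genes being expressed higher in normal cells) and the second-group level lies above its standard (second-group genes being expressed lower in normal cells). The function name and all numeric values are hypothetical.

```python
def condition_indicated(first_level, second_level, first_standard, second_standard):
    """Assumed rule: first-group expression is reduced and second-group
    expression is raised relative to the respective standard levels."""
    return first_level < first_standard and second_level > second_standard


# Hypothetical levels in arbitrary units.
normal_like = condition_indicated(
    first_level=90.0, second_level=10.0, first_standard=50.0, second_standard=40.0
)
condition_like = condition_indicated(
    first_level=20.0, second_level=80.0, first_standard=50.0, second_standard=40.0
)
```

A real assay would more plausibly combine several markers per group and use validated cut-offs; the single comparison above only illustrates the direction of change expected for each gene group.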


87. The assay according to claim 86, wherein the marker is a nucleotide probe. 

88. The assay according to claim 86, wherein the marker is an antibody. 
89. A method for identifying a tissue sample as colo-rectal, comprising subjecting 
the tissue to a method as identified in any of claims 43-61, determining expression 
patterns and comparing the expression patterns determined with expression 
patterns from colo-rectal tissue. 






